Value Recursion in Monadic Computations

Levent Erkok 

Ph.D., OGI School of Science and Engineering, 
Oregon Health and Science University 
October 2002 

Thesis Advisor: Dr. John Launchbury 



This thesis addresses the interaction between recursive declarations and computational 
effects modeled by monads. More specifically, we present a framework for modeling cyclic 
definitions resulting from the values of monadic actions. We introduce the term value 
recursion to capture this kind of recursion. 

Our model of value recursion relies on the existence of particular fixed-point operators 
for individual monads, whose behavior is axiomatized via a number of equational
properties. These properties regulate the interaction between monadic effects and recursive
computations, giving rise to a characterization of the required recursion operation. We 
present a collection of such operators for monads that are frequently used in functional 
programming, including those that model exceptions, non-determinism, input-output, and 
stateful computations. 

In the context of the programming language Haskell, practical applications of value 
recursion give rise to the need for a new language construct, providing support for
recursive monadic bindings. We discuss the design and implementation of an extension to
Haskell's do-notation which allows variables to be bound recursively, eliminating the need 
for programming with explicit fixed-point operators. 

Chapter 1



Introduction 



This thesis addresses the interaction between two fundamental notions in programming 
languages: Recursion and effects. Recursion is the essence of cyclic definitions, both for 
recursive functions and circular data structures. Effects are the essence of computational 
features, including I/O, exceptions, and stateful computations. Although both notions 
have been studied extensively on their own, their interaction has received relatively little 
attention. 

1.1 Recursion and effects 

In the traditional domain theoretic setting, the denotational semantics of recursive
definitions are understood in terms of fixed-points of continuous functions. That is, the
semantics of a definition of the form x = f x is taken to be the least fixed-point of the
map corresponding to f [82, 83]. The same principle works for both recursive functions
and circular data structures, a rather pleasing situation. 

Handling of effects in the denotational framework, however, proved to be much more 
problematic, often summed up by the phrase "denotational semantics is not modular" [53,
64]. Briefly, addition of new effects requires substantial changes to the existing semantic
description. For instance, exceptions can be modeled by adding a special failure element to 
each domain, representing the result of a failed computation. But then, even such a simple 
thing as the meaning of an arithmetic operation requires a messy denotational description; 
one needs to check for failure at each argument, and propagate accordingly. The story is 
similar for other cases, including I/O and assignments, two of the most "popular" effects 
found in many programming languages [76, 77].

It was Moggi's influential work on monads that revolutionized the semantic treatment 
of effects, which he referred to as notions of computation. Moggi showed how monads can 
be used to model programming language features in a uniform way, providing an abstract 
view of programming languages [62, 63]. In the monadic framework, values of a given type
are distinguished from computations that yield values of that type. Since the monadic 
structure hides the details of how computations are internally represented and composed, 
programmers and language designers work in a much more flexible environment. This 
flexibility is a huge win over the traditional approach, where everything has to be explicit. 

Perhaps what Moggi did not quite envision was the response from the functional
programming community, who took the idea to heart. Wadler wrote a series of articles
showing how monads can be used in structuring functional programs themselves, not just
the underlying semantics [89, 91]. Very quickly, the Haskell committee adopted monadic
I/O as the standard means of performing input and output in Haskell, making monads an 
integral part of a modern programming language [68, 69]. The use of monads in Haskell 
is further encouraged by special syntactic support, known as the do-notation [47]. 

As the monadic programming style became more and more popular in Haskell,
programmers started realizing certain shortcomings. For instance, function application
becomes tedious in the presence of effects. Or, the if-then-else construct becomes unsightly
when the test expression is monadic. However, these are mainly syntactic issues that
can easily be worked around. More seriously, the monadic sublanguage lacks support for
recursion over the values of monadic actions. The issue is not merely syntactic; it is
simply not clear what a recursive definition means when the defining expression can perform
monadic effects. 

This problem brings us to the subject matter of the present work: Semantics of
recursive declarations in monadic computations. More specifically, our aim is to study recursion
resulting from the cyclic use of values in monadic actions. We use the term value recursion 
to describe this notion. 

1.2 A motivating example: Modeling circuits using monads 

To illustrate value recursion, we will consider the example that motivated our work in 
the first place: modeling circuits using monads. Microarchitectural design languages have 
been the target of programming language research in recent years, aiming at providing 
better language support for managing the complexity of such designs [12, 58]. Lava [8]
and Hawk [49, 59] are two recent systems designed to address this need. In this section,
we will consider a stripped down version of such a language, embedded in Haskell. 

To familiarize ourselves with the types of circuits we can define, let us first consider 
a simple non-monadic implementation. We represent signals by lists, successive elements 
representing the values at each clock tick. Haskell is already expressive enough to define 
the basic building blocks without much difficulty:

type Sig a = [a]

and, xor :: Sig Bool -> Sig Bool -> Sig Bool
and xs ys = zipWith (&&) xs ys
xor xs ys = zipWith (/=) xs ys

inv :: Sig Bool -> Sig Bool
inv xs = map not xs

delay :: String -> a -> Sig a -> Sig a
delay _ v xs = v : xs

The delay element forms a signal that behaves as its second argument during the first 
clock cycle, behaving as its third argument afterwards. (The first argument to delay is 
intended to be a name for v. We will use it later.) Of course, a more realistic example 
would come equipped with multiplexers, registers, etc., but the elements above will be 
sufficient for our purposes. For instance, we can model a half-adder simply by: 

halfAdd :: Sig Bool -> Sig Bool -> (Sig Bool, Sig Bool)
halfAdd xs ys = (sum, carry)
  where sum   = xor xs ys
        carry = and xs ys

Here is a sample run: 

Main> halfAdd [True, True] [False, True] 
([True,False],[False,True])

As another example, we can create a circuit that toggles its output at each clock tick, 
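starting from the value False:

[Circuit diagram: the output of a DELAY element (initial value False) is fed through an INV element back to the input of the delay, labeled inp.]

toggle :: Sig Bool
toggle = out
  where inp = inv out
        out = delay "False" False inp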



Variables inp and out are defined mutually recursively, corresponding to the feedback 
loop in the circuit diagram. The recursive definition capability of Haskell's where clause 
plays a crucial role in expressing the required cyclic dependency. We have: 



Main> toggle
[False,True,False,True,False,True,False,True,...

Note that the result is an infinite signal. 
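The definitions of this section can be collected into a small module and loaded into GHCi. The following is a minimal sketch, assuming only the Prelude; since the Prelude already exports and and sum, we hide them on import (this import line is our addition and is not part of the definitions above).

```haskell
import Prelude hiding (and, sum)

-- A signal is the (possibly infinite) list of its values, one per clock tick.
type Sig a = [a]

and, xor :: Sig Bool -> Sig Bool -> Sig Bool
and xs ys = zipWith (&&) xs ys
xor xs ys = zipWith (/=) xs ys

inv :: Sig Bool -> Sig Bool
inv xs = map not xs

-- The name argument is ignored in this model.
delay :: String -> a -> Sig a -> Sig a
delay _ v xs = v : xs

halfAdd :: Sig Bool -> Sig Bool -> (Sig Bool, Sig Bool)
halfAdd xs ys = (sum, carry)
  where
    sum   = xor xs ys
    carry = and xs ys

-- The feedback loop is an ordinary recursive binding; laziness keeps it productive.
toggle :: Sig Bool
toggle = out
  where
    inp = inv out
    out = delay "False" False inp
```

Since toggle never ends, it is best observed through a finite prefix, for instance with take 8 toggle.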

What can we do with circuit descriptions? Since we model circuits by functions, we
can pass them around and combine them to build bigger circuits. But, eventually, all we 
can do with a circuit is to simulate it, that is, run it on a particular input. As pointed 
out by Launchbury et al. [49], and Claessen [12], this model does not allow for multiple
interpretations. Ideally, we would like to be able to analyze our circuits, translating 
them to other hardware description languages such as VHDL. Alternatively, we may want 
to render the circuit graphically, obtaining a schematic diagram, or recast the circuit 
description in the language of a theorem prover to let us reason about it. We would like 
our language to be flexible enough to support all of these views. 

The standard way of attacking this problem is to abstract away from any particular 
signal or circuit model, hiding the control flow behind a monad, and basic circuit elements 
behind a type class. Each alternative semantics will be represented as an instance of this 
class, providing new views of circuits. Then, by simply switching to a different monad, 
we will be able to obtain an alternative interpretation without changing existing circuit 
descriptions. Here is one way of capturing the required structure:¹

class Monad m => Circuit m where
  and, xor :: Sig Bool -> Sig Bool -> m (Sig Bool)
  inv      :: Sig Bool -> m (Sig Bool)
  delay    :: String -> a -> Sig a -> m (Sig a)

For instance, the description of the half-adder becomes:

halfAdd :: Circuit m => Sig Bool -> Sig Bool -> m (Sig Bool, Sig Bool)
halfAdd i1 i2 = do sum   <- xor i1 i2
                   carry <- and i1 i2
                   return (sum, carry)

Note that the new model of halfAdd is not committed to any particular circuit model, 
or signal data type. It is a generic description of half-adders. To simulate, all we need is 
the identity monad for expressing the control structure, and the list model for signals: 

type Sig a = [a]

data Simulate a = Sim a deriving Show

instance Monad Simulate where
  return x    = Sim x
  Sim x >>= f = f x
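The instance above is written against the Haskell of the time. As a minimal sketch, assuming a present-day GHC (where Functor and Applicative are superclasses of Monad), the same monad can be declared as follows. The two extra instances and the helper runSim are our additions; they do not change the behavior of return and >>=.

```haskell
-- The identity-like monad used for simulation.
data Simulate a = Sim a deriving Show

instance Functor Simulate where
  fmap f (Sim x) = Sim (f x)

instance Applicative Simulate where
  pure = Sim
  Sim f <*> Sim x = Sim (f x)

-- return defaults to pure, so only >>= needs to be given.
instance Monad Simulate where
  Sim x >>= f = f x

-- Hypothetical helper for extracting the simulated result.
runSim :: Simulate a -> a
runSim (Sim x) = x
```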

Unsurprisingly, the Circuit instance for the Simulate monad will simply mimic our
non-monadic implementation:

instance Circuit Simulate where
  and xs ys    = return (zipWith (&&) xs ys)
  xor xs ys    = return (zipWith (/=) xs ys)
  inv xs       = return (map not xs)
  delay _ v xs = return (v : xs)

Using this model, we have:

Main> halfAdd [True, True] [False, True] :: Simulate ([Bool], [Bool])
Sim ([True,False],[False,True])

1 A better alternative would be parameterizing the Circuit class over the Sig type as well, using a
multiparameter type class. We refrain from doing so, however, for the sake of simplicity.

More interestingly, we can consider an alternative semantics which will create a
wire-by-wire description of a given circuit. In this model, signals will be identified by symbolic
names. Our monad will have to generate new names for intermediate wires, accumulating
a textual "drawing" of the circuit as it is built. Hence, we employ a combination of state
and output monads:

type Sig a = String

data Draw a = D (Int -> (a, [String], Int))

instance Show a => Show (Draw a) where
  show (D f) = let (l, s, _) = f 0
               in concatMap (++ "\n") s ++ "Result: " ++ show l

instance Monad Draw where
  return x  = D (λi. (x, [], i))
  D f >>= g = D (λi. let (a, o, i')   = f i
                         D h          = g a
                         (b, o', i'') = h i'
                     in (b, o ++ o', i''))

We will need the following auxiliary functions:

newWire :: Draw String
newWire = D (λi. ('w' : show i, [], i+1))

output :: String -> Draw ()
output s = D (λi. ((), [s], i))

item :: String -> String -> String -> Draw String
item a b c = do n <- newWire
                output (n ++ " = " ++ a ++ " " ++ b ++ " " ++ c)
                return n

The function newWire simply returns a new name. (The variable i keeps track of the
number of wires.) The function output lets us emit intermediate descriptions. Finally,
item is a generic function for creating a new wire together with a description of how it is
obtained. Using these auxiliaries, the Circuit instance for the Draw monad becomes:²

instance Circuit Draw where
  and a b     = item "and" a b
  xor a b     = item "xor" a b
  delay s v a = item "delay" s a
  inv a       = do n <- newWire
                   output (n ++ " = inv " ++ a)
                   return n

We have:

Main> print (halfAdd "a" "b" :: Draw (Sig Bool, Sig Bool))
w0 = xor a b
w1 = and a b
Result: ("w0","w1")

It is worth emphasizing that the description of halfAdd did not change; we simply used
a different monad. This is the strength of the monadic approach.

Unfortunately, a similar translation for the toggle circuit does not work. Consider:

toggle :: Circuit m => m (Sig Bool)
toggle = do inp <- inv out
            out <- delay "False" False inp
            return out

Although the description perfectly fits the circuit diagram we had before, we have lost the
feedback loop. The variables inp and out are no longer recursively defined! (In fact, the
definition above is not even valid Haskell; the variable out is not in scope in the first line.)
Our non-monadic implementation did not have this problem, as it relied on the recursive
definition capabilities of Haskell. But now, we are on our own: Haskell does not let us
write recursive specifications in the presence of monadic effects.

Unfortunately, the problem is not merely syntactic. It is not clear how to perform this
kind of recursion at all: we want the values (i.e., the signals) to be defined recursively, but
we certainly do not want the effects to be repeated or lost (i.e., we do not want to create
circuit elements repeatedly, or not to create them at all). We refer to this kind of recursion
as value recursion. In short, to be able to express the required recursive structure, we need
the underlying monad to support recursive monadic bindings [18]. Just as the usual
fixed-point operator handles "normal" recursion, we expect to find value recursion operators,
generically called mfix, mediating the interaction between the underlying effect and the
recursion operation.

2 The delay element did not use its first argument in the simulation model, and here it does not use
the second. The name is irrelevant for simulation, while it is all we need in a textual representation.

Getting back to circuit modeling, we will require circuits to be modeled by monads for 
which such fixed-point operators are available, captured by the MonadFix class: 

class Monad m => MonadFix m where
  mfix :: (a -> m a) -> m a

class MonadFix m => Circuit m where
  -- and, xor, inv, delay as before

Now, we can tie the recursive knot over inp and out, expressing toggle as follows: 

toggle :: Circuit m => m (Sig Bool)
toggle = mfix (λ~(inp, out). do inp <- inv out
                               out <- delay "False" False inp
                               return (inp, out))
         >>= λ(inp, out). return out

The final missing piece is the MonadFix instances for Simulate and Draw monads. At 
this point, we ask the reader to simply accept the following definitions: 

instance MonadFix Simulate where
  mfix f = Sim (let Sim a = f a in a)

instance MonadFix Draw where
  mfix f = D (λi. let D g        = f a
                      (a, s, i') = g i
                  in (a, s, i'))

Note that the Simulate instance is essentially the same as the usual fixed-point
operator. The Draw instance is a bit more complicated, but the reader can see that we perform
the fixed-point computation over the variable a (i.e., the value), passing around i and s
untouched. Now, to simulate toggle, we just use our Simulate monad:

Main> toggle :: Simulate (Sig Bool)
Sim [False,True,False,True,False,True,False,...

and, to get a simple textual drawing, we simply switch to the Draw monad:

Main> toggle :: Draw (Sig Bool)
w0 = inv w1
w1 = delay False w0
Result: "w1"
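For readers who wish to experiment, the drawing interpretation can be assembled into one module. The following is a sketch under stated assumptions rather than the thesis code verbatim: it targets a present-day GHC, takes MonadFix from the library module Control.Monad.Fix instead of declaring the class locally, adds the Functor and Applicative instances that current compilers require, and hides the Prelude names and and sum. The lazy pattern in toggle is essential, since the pair of wires only exists after the body has run.

```haskell
import Prelude hiding (and, sum)
import Control.Monad (ap, liftM)
import Control.Monad.Fix (MonadFix (..))

-- In the drawing interpretation a signal is just the name of a wire.
type Sig a = String

class MonadFix m => Circuit m where
  and, xor :: Sig Bool -> Sig Bool -> m (Sig Bool)
  inv      :: Sig Bool -> m (Sig Bool)
  delay    :: String -> a -> Sig a -> m (Sig a)

-- State (the wire counter) combined with output (the description lines).
data Draw a = D (Int -> (a, [String], Int))

-- Derived from the monad operations; required by current GHC.
instance Functor Draw where
  fmap = liftM

instance Applicative Draw where
  pure x = D (\i -> (x, [], i))
  (<*>)  = ap

instance Monad Draw where
  D f >>= g = D $ \i ->
    let (a, o, i')   = f i
        D h          = g a
        (b, o', i'') = h i'
    in  (b, o ++ o', i'')

-- The knot is tied over the value a only; counter and output pass through once.
instance MonadFix Draw where
  mfix f = D $ \i ->
    let D g        = f a
        (a, s, i') = g i
    in  (a, s, i')

instance Show a => Show (Draw a) where
  show (D f) =
    let (l, s, _) = f 0
    in  concatMap (++ "\n") s ++ "Result: " ++ show l

newWire :: Draw String
newWire = D (\i -> ('w' : show i, [], i + 1))

output :: String -> Draw ()
output s = D (\i -> ((), [s], i))

item :: String -> String -> String -> Draw String
item a b c = do
  n <- newWire
  output (n ++ " = " ++ a ++ " " ++ b ++ " " ++ c)
  return n

instance Circuit Draw where
  and a b     = item "and" a b
  xor a b     = item "xor" a b
  delay s _ a = item "delay" s a
  inv a       = do
    n <- newWire
    output (n ++ " = inv " ++ a)
    return n

halfAdd :: Circuit m => Sig Bool -> Sig Bool -> m (Sig Bool, Sig Bool)
halfAdd i1 i2 = do
  sum   <- xor i1 i2
  carry <- and i1 i2
  return (sum, carry)

toggle :: Circuit m => m (Sig Bool)
toggle = mfix body >>= \(_, out) -> return out
  where
    body ~(_, out) = do
      inp  <- inv out
      out' <- delay "False" False inp
      return (inp, out')
```

Printing toggle :: Draw (Sig Bool) with this module reproduces the three lines of the drawing shown above.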

The handling of recursion via mfix is somewhat mysterious at this point. The whole 
point of this thesis is to expose the mystery, and to explore the interaction between 
recursion and effects, heading toward an equational theory of value recursion. 

1.3 Recursive monadic bindings

The use of mfix to tie the recursive knot in a monadic computation is similar to the handling 
of recursive bindings in usual let-expressions. For clarity, we will use the keyword letrec 
here when a binding can be recursive, and let otherwise. In the pure world, we have: 

letrec x = e in e'
  = let x = fix (λx. e) in e'
  = (λx. e') (fix (λx. e))

What happens in a monadic computation? Similar to letrec, let us use the keyword
mdo³ for monadic bindings that can be recursive, and do otherwise. We have:

mdo { x <- e; e' }
  = do { x <- mfix (λx. e); e' }
  = mfix (λx. e) >>= λx. e'

In Chapter 7, we will describe an extension to the do-notation of Haskell allowing
bindings to be recursive, using an enhanced version of this translation. Then, we will be 
able to write the toggle example of the previous section as follows, the compiler taking 
care of the insertion of appropriate calls to mfix: 

toggle = mdo inp <- inv out
             out <- delay "False" False inp
             return out
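Present-day GHC accepts mdo blocks of this shape when the RecursiveDo language extension is enabled. The following is a minimal sketch under that assumption; to keep it self-contained we use the library's Identity monad (which has a MonadFix instance) in place of Simulate, and specialize inv and delay to it.

```haskell
{-# LANGUAGE RecursiveDo #-}

import Data.Functor.Identity (Identity (..))

type Sig a = [a]

-- The two circuit elements needed by toggle, specialized to Identity.
inv :: Sig Bool -> Identity (Sig Bool)
inv xs = return (map not xs)

delay :: String -> a -> Sig a -> Identity (Sig a)
delay _ v xs = return (v : xs)

-- The compiler inserts the call to mfix for the recursive bindings.
toggle :: Identity (Sig Bool)
toggle = mdo
  inp <- inv out
  out <- delay "False" False inp
  return out
```

A finite prefix of the simulated signal can then be observed with take 6 (runIdentity toggle).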

There is an opportunity here to clarify a potentially confusing issue about value
recursion. Consider a recursive definition of the form:

countDown n = if n == 0
              then print "Done!"
              else do print n
                      countDown (n-1)

The intention is clear: Each time countDown is called, we want the effect of printing to 
take place. In this thesis, we will not be dealing with such definitions, as they are already 
explained in terms of the usual fixed-point construction: 

countDown = fix (λf. λn. if n == 0
                         then print "Done!"
                         else do print n
                                 f (n-1))



3 The closest we can get to μdo using ASCII. (We would have used dorec, but that is just too long.) Note
that the use of Haskell-like syntax is just for convenience. We could have used Moggi's let_T x ⇐ e in e'
notation and the keyword letrec_T as well [63].

Note that effects are part of countDown's execution, rather than its definition. That
is, the effect of printing is not performed to determine the meaning of countDown itself. 
In the toggle example, however, we see that effects are part of the definition: They are 
performed in order to determine the values of inp and out, and the cyclic dependence 
gives rise to the need for value recursion. In a sense, the use of recursion and effects in 
countDown are orthogonal, with no interference in between. As shown above, this kind of 
recursion is already explained in terms of fix, the usual fixed-point operator. 

1.4 A generic mfix? 

In Section 1.2, we saw two particular examples of mfix, one for the Simulate monad,
and another one for the Draw monad. Are these functions actually instances of a generic 
schema? That is, can we find a definition of mfix that will work for all monads, regardless 
of which kind of effect we deal with? Let us pause briefly and consider how one might go 
about defining such a generic operator. 

Recall that the least fixed-point operator on domains satisfies the property: 

fix :: (a -> a) -> a
fix f = f (fix f)

which also serves as a definition for fix in a lazy language such as Haskell. One might 
think that a similar defining equation can be found for mfix as well. Indeed, it is not hard 
to generalize to the monadic case: 

mfix :: Monad m => (a -> m a) -> m a
mfix f = mfix f >>= f

Note that this definition makes sense for all monads (i.e., it is polymorphic in m). But 
is it a "good" definition? That is, can we use it sensibly to implement value recursion? 

The short answer to this question is, unfortunately, no. To see why not, simply note 
that this definition is equivalent to: 

mfix f = fix (λm. m >>= f) = ⊔ {⊥, ⊥ >>= f, (⊥ >>= f) >>= f, ...}

which will diverge whenever the >>= operator is strict in its first argument.⁴ Furthermore,
even when >>= is not strict, this definition will attempt to compute the fixed-point over
both values and effects, which is simply not what we are trying to achieve. In value
recursion, we want the fixed-point to be computed only over the values, without repeating
or losing the effects. We will codify what we mean by value recursion in Chapter 2 using
a number of equational properties, exploring the interaction between recursion and effects
in depth. Then, we will be able to see more clearly why this default definition is not
appropriate for implementing value recursion.

4 Note that the call to mfix f will diverge regardless of what f is. In general, monads based on sum
types will suffer from this problem, as the >>= operator needs to inspect its first argument to see how to
proceed. Haskell's maybe and list monads are two popular examples that are based on sum types. Other
important examples where the >>= operator is strict in its first argument include the frequently used IO
and strict state monads.
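The difference can be observed directly. In the sketch below, mfixNaive is our name for the default definition discussed above, and ones is a hypothetical effect-free knot; the library's Control.Monad.Fix supplies a proper mfix for the Maybe monad. We assume the Identity monad from Data.Functor.Identity as an example whose >>= does not inspect its first argument.

```haskell
import Control.Monad.Fix (mfix)
import Data.Functor.Identity (Identity (..))

-- The default definition from the text, renamed to avoid clashing with the library.
mfixNaive :: Monad m => (a -> m a) -> m a
mfixNaive f = mfixNaive f >>= f

-- An effect-free knot: the infinite list of ones.
ones :: Monad m => [Int] -> m [Int]
ones xs = return (1 : xs)

-- Identity's >>= is not strict in its first argument, so the naive version is productive.
naiveIdentity :: [Int]
naiveIdentity = take 3 (runIdentity (mfixNaive ones))

-- The library instance for Maybe ties the knot over the value only.
libraryMaybe :: Maybe [Int]
libraryMaybe = fmap (take 3) (mfix ones)

-- Diverges if evaluated: Maybe's >>= must inspect mfixNaive ones before calling ones.
naiveMaybe :: Maybe [Int]
naiveMaybe = fmap (take 3) (mfixNaive ones)
```

Here naiveIdentity and libraryMaybe both yield three ones, while any attempt to evaluate naiveMaybe fails to terminate.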

1.5 The basic framework and notation 

For most of this thesis we investigate value recursion in the usual domain theoretic
semantics of programming languages, where types are modeled by domains [77, 82]. We write ⊥_τ
for the least element of the domain representing the type τ, dropping the subscript
whenever unambiguous. Functions are modeled by continuous (and hence monotonic) maps
between domains, not necessarily strict. Recursion is modeled via least fixed-points. We
use monads to model effects, following Moggi [63]. Although it is by no means comprehensive,
the reader may find it useful to skim over Appendix A, which contains a brief review of
fixed-point operators.

We expect readers to be familiar with functional programming [35, 87], particularly
Haskell [7, 68]. For the most part, we use Haskell simply as a syntactically beefed up
version of λ-calculus [30], so familiarity with any functional language should be sufficient. A
basic understanding of domain theoretic semantics of programming languages is necessary
to follow the technical development [76, 83]. Except for Chapter 6, we will be mainly
interested in the "functional programming view" of monads [89, 91], rather than the categorical
one [55]. Finally, we will have occasion to use the parametricity principle, allowing us
to derive theorems from the types of polymorphic functions [50, 75, 88].

Naturally, the theory of value recursion is independent of any particular programming 
language. However, our work is closely tied to Haskell, and we will be careful in pointing 
out the cases when the domain theoretic semantics and the semantics of Haskell do not 
quite match up. The main differences show up in the treatment of products. Since tuples 
are lifted in Haskell, it is not the case that (⊥_α, ⊥_β) = ⊥_{α×β}. Therefore, the equality
x = (π₁ x, π₂ x) fails. Similarly, λ(x :: α). ⊥_β ≠ ⊥_{α→β}, i.e., the function type is lifted too.
Similar comments apply to sum types as well. Finally, the unit data type has two members,
⊥_() and () itself, that is, it is not really a terminal object. Luckily, these differences do
not cause much trouble in practice, as long as one is aware of them. We point out the 
cases where the difference becomes significant. 
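The lifting of products and of the unit type can be observed in GHC by forcing values to weak head normal form. The following sketch assumes the exception primitives of Control.Exception; the helper throwsWhenForced and the sample values are ours, and the helper can only detect bottoms that raise an exception (such as undefined), not non-termination.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- True if forcing x to weak head normal form raises an exception.
throwsWhenForced :: a -> IO Bool
throwsWhenForced x = do
  r <- try (evaluate (x `seq` ())) :: IO (Either SomeException ())
  return (either (const True) (const False) r)

bottomPair, pairOfBottoms, rebuiltPair :: (Int, Int)
bottomPair    = undefined                          -- the least element of the product
pairOfBottoms = (undefined, undefined)             -- a pair whose components are bottom
rebuiltPair   = (fst bottomPair, snd bottomPair)   -- (π₁ x, π₂ x) for x = bottom

bottomUnit, unit :: ()
bottomUnit = undefined
unit       = ()
```

Forcing bottomPair or bottomUnit raises an exception, whereas pairOfBottoms, rebuiltPair, and unit are all defined; in particular, rebuiltPair differs from the bottomPair it was built from.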

In our exposition, we will stick to Haskell notation as much as possible, deviating 
from it only for typographical purposes. The difference mainly shows up in compositions 
and λ-bindings. For instance, we will write Haskell's: \f -> \g -> \x -> (f . g) x as
λf. λg. λx. (f · g) x.

1.6 Outline of the thesis 

Our aim is to get through the basics of value recursion rather quickly, before we actually 
investigate individual instances. To this end, we use the next two chapters to introduce 
a number of equational properties that govern the behavior of value recursion operators. 
Among these, we will identify three fundamental properties (namely strictness, purity, and 
left shrinking), and in the remainder of the thesis we will consider only those operators 
that satisfy this minimal core. 

Chapters 4 and 5 are dedicated to the study of individual instances. In Chapter 4, we
investigate a wide range of monads that are frequently used in functional programming,
presenting value recursion operators for them. In Chapter 5, we argue that it is highly
unlikely that the continuation monad has an associated value recursion operator that will 
satisfy our requirements. 

Chapter 6 takes a step back and looks at a possible categorical theory of value recursion,
based on the notion of premonoidal categories and traces. Even though the theory of traces 
does not provide a perfect fit, it is illuminating to see how recent work in this area can be 
generalized to capture value recursion for a certain class of monads. 

Chapters 7 and 8 deal with the Haskell language in particular. In Chapter 7, we will
turn our attention to syntactic support for value recursion, presenting a recursive version
of Haskell's do-notation. In Chapter 8, we will study Haskell's IO monad. Since the IO
monad is hardwired into Haskell, it is not possible to investigate value recursion for it
directly. Hence, we present a model language (complete with I/O operations and mutable 
variables), and show how one can model value recursion in this world. 

Chapter 9 presents a number of examples, which, in addition to the circuit modeling
example of this chapter, provide a tour of potential applications of value recursion.

Chapter 10 concludes the thesis with a discussion of related work and future research
directions. A brief review of fixed-points, along with several proofs that are omitted from
the main body of the thesis, is given in the appendices.

Each chapter in the remainder of this thesis starts with a brief description of its 
contents. Although we intend the chapters to be read in order, readers may find it useful 
to quickly skim over these segments to determine a particular reading plan according to 
their own interests. 



Chapter 2 
Properties of value recursion operators 



What kinds of properties do we expect value recursion operators to satisfy? So far, we have 
been using phrases like "recursion without repeating or losing effects," or "recursion only
over the values" to characterize value recursion. The aim of this chapter is to formalize 
our intuitions by means of equational properties. 

Synopsis. We discuss a number of equivalences that we expect value recursion operators 
to satisfy. These properties range from those that imitate properties of the usual
fixed-point operator over domains, to those that govern the interaction between recursion and
effects. We also provide a number of derived properties, including those that are granted by 
virtue of parametricity. Several properties that might be naively expected, yet unsatisfiable 
for a wide range of monads, are discussed as well. 

2.1 Strictness (Nothing from nothing) 

The domain theoretic treatment of recursion in programming languages relies on least
fixed-points [76, 83]. That is, given a specification of the form x = f x, where f :: α -> α,
we expect x to be the least element of α satisfying this equation. In this setting, one can show that a
function is strict if and only if its least fixed-point is ⊥. Since ⊥ represents no information
in the domain theoretic ordering, our slogan in this case is simply nothing from nothing.
Generalizing to value recursion, we expect the following property to hold:

Property 2.1.1 (Strictness.) Let f :: α -> m α,

    f ⊥_α = ⊥_{m α}  ⟺  mfix f = ⊥_{m α}        (2.1)

Remark 2.1.2 In Section 2.6.2, we will be able to derive the right to left implication
from other properties, i.e., we will show that if mfix f is ⊥, then f must be strict. We
prefer expressing the strictness law as it is, however, as it uniquely characterizes strict
functions of type α -> m α.

2.2 Purity (Just like fix)

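Purity formalizes the intuition that mfix should behave exactly like fix, in case there are
no effects:

Property 2.2.1 (Purity.) Let h :: α -> α,

    mfix (return · h) = return (fix h)        (2.2)

Diagrammatically, we capture purity as follows:

[Wiring diagram for the purity property: on the left, a dashed box around the pure box h; on the right, the h box with a solid feedback loop.]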


Remark 2.2.2 We use wiring diagrams to capture properties pictorially. Note that we 
do not formalize these diagrams, nor use them for any purpose other than illustration. 
Dashed boxes represent where value recursion is performed. Thin lines show data flow. 
The thick line, called the effect line, refers to the details of the monadic computation. 
Although it is not correct to consider the effect line as carrying data, it usually helps to 
think of it as such. (The effect line analogy holds very well for the state monad, but it is 
not very intuitive for, say, the exception monad.) We indicate pure computations by not 
letting them use the effect line, as illustrated by the h box in the above diagram. The 
solid loop on the right hand side indicates the use of fix. (Note that there are no dashed 
boxes on the right hand side as there are no applications of mfix.) 
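Purity can be checked on concrete instances. The sketch below assumes the MonadFix instances for Maybe and lists that ship with GHC's Control.Monad.Fix; the step function h and the two test values are hypothetical examples of ours. Since the results are infinite lists, we compare finite prefixes only.

```haskell
import Control.Monad.Fix (MonadFix, mfix)
import Data.Function (fix)

-- A pure step function: prepend 1 to the list being defined.
h :: [Int] -> [Int]
h xs = 1 : xs

-- Both sides of the purity equation, for any monad with a value recursion operator.
purityLhs, purityRhs :: MonadFix m => m [Int]
purityLhs = mfix (return . h)
purityRhs = return (fix h)

-- Instance checks on the first five elements of the result.
purityMaybe :: Bool
purityMaybe = fmap (take 5) (purityLhs :: Maybe [Int]) == fmap (take 5) purityRhs

purityList :: Bool
purityList = map (take 5) (purityLhs :: [[Int]]) == map (take 5) purityRhs
```

Both checks evaluate to True. They are, of course, only spot checks of the law on one function, not a proof.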



2.3 Left shrinking (No recursion — No fix) 

Recall our naive translation schema for the recursive do-notation from Section 1.3.
Naturally, we would like mdo to behave exactly like do, provided there are no recursive
bindings. That is, the following two code fragments should have the same meaning: 

mdo x <- A              do x <- A
    B                      mdo B

provided A does not make use of x, or any variable defined in the block B. If B does not 
have any recursive bindings either, we can push mdo further down, eventually eliminating 
it altogether. We capture this correspondence by the left shrinking property: 

Property 2.3.1 (Left shrinking.) Let f :: α -> β -> m α, a :: m β,

    mfix (λx. a >>= λy. f x y) = a >>= λy. mfix (λx. f x y)        (2.3)

where x does not occur free in a.

The name "left shrinking" is suggested by the corresponding diagram:

[Wiring diagram for the left shrinking property]


Remark 2.3.2 The reader might expect an analogous right shrinking property as well. 
But, as we will see in Chapter 3, arbitrary lifting of computations from the right hand
side of a >>= is not possible in general. We can, however, lift pure computations out. We
will provide a derived law to deal with this case in Section 2.6.3.
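As an illustration, left shrinking can be spot-checked for the list and Maybe monads. The sketch assumes the MonadFix instances from GHC's Control.Monad.Fix; the function f below is a hypothetical example, and the computation a is passed in as a parameter so that it trivially does not mention x.

```haskell
import Control.Monad.Fix (MonadFix, mfix)

-- f uses the recursive variable x lazily and the result y of the effectful computation.
f :: Monad m => [Int] -> Int -> m [Int]
f x y = return (y : x)

-- Both sides of the left shrinking equation.
leftShrinkLhs, leftShrinkRhs :: MonadFix m => m Int -> m [Int]
leftShrinkLhs a = mfix (\x -> a >>= \y -> f x y)
leftShrinkRhs a = a >>= \y -> mfix (\x -> f x y)

-- In the list monad, a = [1, 2] is a nondeterministic choice that must happen once.
listAgrees :: Bool
listAgrees = map (take 3) (leftShrinkLhs [1, 2]) == map (take 3) (leftShrinkRhs [1, 2])

-- In the Maybe monad, both success and failure of a are preserved.
maybeAgrees :: Bool
maybeAgrees =
  fmap (take 3) (leftShrinkLhs (Just 7)) == fmap (take 3) (leftShrinkRhs (Just 7))
    && fmap (take 3) (leftShrinkLhs Nothing) == fmap (take 3) (leftShrinkRhs Nothing)
```

For the list case both sides produce two infinite lists, one consisting of ones and one of twos; the choice made by a is neither repeated nor lost.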

2.4 Sliding (Pure mobility) 

Let f :: α -> β and h :: β -> α. As reviewed in Appendix A, the equation

    fix (h · f) = h (fix (f · h))        (2.4)

expresses the dinaturality condition for fix, an extremely important law for manipulating
fixed-points. We expect value recursion operators to satisfy a similar law as well.

Two problems arise in translating Equation 2.4 to the world of value recursion. The
order of f and h is swapped, and h is duplicated on the right hand side. Obviously, if f
and h can both perform effects, swapping and duplication are both out of the question. When
h is pure, however, we expect to be able to slide it over f:

Property 2.4.1 (Sliding.) Let f :: α -> m β, h :: β -> α,

    f (h ⊥) = f ⊥  ⟹  mfix (map h · f) = map h (mfix (f · h))        (2.5)

where map :: (a -> b) -> m a -> m b is the usual lifting function.¹ The consequent
can be equivalently expressed as:

    mfix (λx. f x >>= return · h) = mfix (f · h) >>= return · h        (2.6)

1 The function map is defined by the equation map f m = m >>= return · f. Note that, in Haskell
notation map is called fmap, and the name map is reserved to be used with the list monad only [68].
Deviating from Haskell, we use the name map consistently for all monads.
Diagrammatically:

[Diagram omitted.]

The side condition, i.e., that $f \cdot h$ and $f$ should agree on $\bot$, is essential. When we think of recursion as an iterative process that starts with $\bot$, we see that $f$ first receives $\bot$ on the left hand side in the recursive loop, but receives $h\,\bot$ on the right. If $h\,\bot \neq \bot$, $f$ will have more information to start with on the right hand side. The side condition guarantees that this extra knowledge is irrelevant: $f$ must not distinguish between $\bot$ and $h\,\bot$. It is worth noting that dinaturality of fix (Equation 2.4) does not require any such conditions. As we will see in Chapter 3, however, without the side condition, sliding is unsatisfiable for many practical monads of interest.
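
The following sketch checks one instance of sliding in Haskell. It assumes GHC's `MonadFix` instance for `Maybe` stands in for mfix and uses `fmap` for map; the names `produce`, `bump`, `slideLhs` and `slideRhs` are hypothetical. The pure function chosen here is strict, so the side condition $f\,(h\,\bot) = f\,\bot$ holds trivially.

```haskell
import Control.Monad.Fix (mfix)

-- Plays the role of f in Equation 2.5: an effectful step in the Maybe monad.
produce :: [Int] -> Maybe [Int]
produce xs = Just (0 : xs)

-- Plays the role of h: a pure, strict function (map (+ 1) undefined diverges).
bump :: [Int] -> [Int]
bump = map (+ 1)

-- Left hand side: h is applied to the result inside the loop.
slideLhs :: Maybe [Int]
slideLhs = mfix (fmap bump . produce)

-- Right hand side: h has been slid over f and is applied once more outside.
slideRhs :: Maybe [Int]
slideRhs = fmap bump (mfix (produce . bump))
```

Both sides denote the infinite list 1, 2, 3, and so on, wrapped in `Just`; comparing finite prefixes with `fmap (take 4)` shows the agreement.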

Observation 2.4.2 The side condition is trivially satisfied if $h$ is strict. It turns out that this particular case is derivable from parametricity (see Corollary 2.6.12).

Note The alert reader will note that the order of effects does not matter for commutative monads, and hence one might expect a swapping property where both computations are effectful. This is indeed the case; see Section 3.3 for details.



2.5 Nesting (Two for the price of one) 

Bekić's property for usual fixed-points states that simultaneous recursion over multiple variables is equivalent to recursion over one variable at a time (see Appendix A). In the value recursion world, one way to express this relation is to assert the equivalence of the following two expressions:

    mdo x <- mdo y <- f (x, y)
                     return y
        return x

    mdo x <- f (x, x)
        return x

The nesting property² stipulates this equivalence:

Property 2.5.1 (Nesting.) Let $f :: (\alpha, \alpha) \to m\,\alpha$. Then,

$$\mathit{mfix}\,(\lambda x.\ \mathit{mfix}\,(\lambda y.\ f\,(x, y))) \;=\; \mathit{mfix}\,(\lambda x.\ f\,(x, x)) \tag{2.7}$$

2 This property was first suggested to us by Ross Paterson (personal communication).
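
A concrete instance of Equation 2.7 can be sketched in Haskell. The example assumes GHC's `MonadFix` instance for `Maybe` as the value recursion operator; `body`, `nested` and `flat` are hypothetical names, and explicit `mfix` calls are used instead of the mdo-notation.

```haskell
import Control.Monad.Fix (mfix)

-- Plays the role of f :: (a, a) -> m a; it builds a list from both
-- recursive arguments without forcing either of them first.
body :: ([Integer], [Integer]) -> Maybe [Integer]
body (x, y) = Just (1 : zipWith (+) x y)

-- Left hand side of Equation 2.7: two nested loops, one per variable.
nested :: Maybe [Integer]
nested = mfix (\x -> mfix (\y -> body (x, y)))

-- Right hand side of Equation 2.7: a single loop over one variable.
flat :: Maybe [Integer]
flat = mfix (\x -> body (x, x))
```

In both cases the recursive list satisfies xs = 1 : zipWith (+) xs xs, i.e., the powers of two, so finite prefixes of `nested` and `flat` coincide.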
The following proposition states an equivalent form of nesting, which is quite useful in symbolic manipulations:

Proposition 2.5.2 Let $f :: (\tau, \alpha) \to m\,(\tau, \alpha)$. Assuming true products, the equation

$$\mathit{mfix}\,(\lambda(x, \_).\ \mathit{mfix}\,(\lambda(\_, y).\ f\,(x, y))) \;=\; \mathit{mfix}\ f \tag{2.8}$$

is satisfied exactly when nesting holds, provided mfix satisfies the sliding property.
Proof See Appendix B.1. □

Using Equation 2.8, it is easy to describe nesting diagrammatically:

[Diagram omitted.]

Just like the Bekić property for fix, nesting generalizes to any number of variables. For instance, one can derive:

$$\mathit{mfix}\,(\lambda(x, \_, \_).\ \mathit{mfix}\,(\lambda(\_, y, \_).\ \mathit{mfix}\,(\lambda(\_, \_, z).\ f\,(x, y, z)))) \;=\; \mathit{mfix}\,(\lambda(x, y, z).\ f\,(x, y, z)) \tag{2.9}$$

Note that the order of nesting is also immaterial: we could have recursed over any permutation of the variables, for instance, first over $z$, then $x$, and finally $y$.

Remark 2.5.3 We will take a closer look at Equation 2.8 in the case of lifted products (as in Haskell). Assuming mfix satisfies strictness, the left hand side will always be $\bot$, due to strict matching against pairs. Using irrefutable patterns, one might attempt:

$$\mathit{mfix}\,(\lambda\,{\sim}(x, \_).\ \mathit{mfix}\,(\lambda\,{\sim}(\_, y).\ f\,(x, y))) \;=\; \mathit{mfix}\ f$$

However, a problem still remains. If $f$ is strict, then the right hand side will be $\bot$, but the left hand side might produce an answer, because $\bot \neq (\bot, \bot)$.³ The proper way of expressing Equation 2.8 with lifted products is:

$$\mathit{mfix}\,(\lambda\,{\sim}(x, \_).\ \mathit{mfix}\,(\lambda\,{\sim}(\_, y).\ f\,(x, y))) \;=\; \mathit{mfix}\,(\lambda\,{\sim}(x, y).\ f\,(x, y)) \tag{2.10}$$

Similar to Proposition 2.5.2, one can establish:

Proposition 2.5.4 In the case of lifted products, Equation 2.7 is equivalent to Equation 2.10, provided mfix satisfies sliding. □

3 Peter Thiemann was first to notice this problem (personal communication).
2.6 Derived properties

One can derive new equalities using the properties we have described so far, and proper- 
ties of the underlying domain-theoretic framework. This section presents a collection of 
such laws — those that we have found to be the most useful when reasoning about value 
recursion. 

2.6.1 Constant functions 

Left shrinking and purity properties imply an expected property of fixed-point operators: if the fixed-point variable is not used, recursion is irrelevant.

Proposition 2.6.1 Let $a :: m\,\alpha$ be a constant (i.e., $x$ does not occur free in $a$). Then,

$$\mathit{mfix}\,(\lambda x.\ a) \;=\; a \tag{2.11}$$

provided mfix satisfies purity and left shrinking laws.
Proof

$$\begin{aligned}
\mathit{mfix}\,(\lambda x.\ a) &= \mathit{mfix}\,(\lambda x.\ a \gg= \lambda y.\ \mathit{return}\ y) \\
&= a \gg= \lambda y.\ \mathit{mfix}\,(\lambda x.\ \mathit{return}\ y) && \{\text{left shrinking}\} \\
&= a \gg= \lambda y.\ \mathit{return}\,(\mathit{fix}\,(\lambda x.\ y)) && \{\text{purity}\} \\
&= a \gg= \lambda y.\ \mathit{return}\ y \\
&= a
\end{aligned}$$

Note that $\mathit{fix}\,(\lambda x.\ y) = (\lambda x.\ y)\,(\mathit{fix}\,(\lambda x.\ y)) = y$. □

The diagram in this case is trivial:

[Diagram omitted.]

Similarly, we can lift a conditional expression from inside an mfix, if the test expression is not involved in the recursive computation:

Proposition 2.6.2 Let $a$ be a boolean expression where $x$ does not occur free in $a$. Let $f, g :: \alpha \to m\,\alpha$. We have:

$$\mathit{mfix}\,(\lambda x.\ \textbf{if}\ a\ \textbf{then}\ f\,x\ \textbf{else}\ g\,x) \;=\; \textbf{if}\ a\ \textbf{then}\ \mathit{mfix}\ f\ \textbf{else}\ \mathit{mfix}\ g \tag{2.12}$$

Proof Case analysis on the value of $a$. The True and False cases are obvious. When $a = \bot$, the left hand side yields $\bot$ by Proposition 2.6.1, guaranteeing the equivalence. □
2.6.2 Approximation property

Monotonicity implies that $f\,\bot$ always provides an approximation to $\mathit{mfix}\ f$:

Proposition 2.6.3 Let $f :: \alpha \to m\,\alpha$. Then,

$$f\,\bot \;\sqsubseteq\; \mathit{mfix}\ f$$

provided mfix satisfies purity and left shrinking.

Proof Since $(\lambda x.\ f\,\bot) \sqsubseteq (\lambda x.\ f\,x)$, we have $\mathit{mfix}\,(\lambda x.\ f\,\bot) \sqsubseteq \mathit{mfix}\ f$ by the monotonicity of mfix. But the left hand side is $f\,\bot$ by Proposition 2.6.1, completing the proof. □

Remark 2.6.4 Proposition 2.6.3 states more than a rudimentary fact: $f\,\bot$ yields valuable information on the structure of the fixed-point. Consider the list monad, for instance. If $f :: \alpha \to [\alpha]$, and if $f\,\bot$ is a cons-cell, then so is $\mathit{mfix}\ f$. In particular, if $f\,\bot$ is a finite list of length $k$, then the length of the fixed-point is $k$ as well. In general, for any monad based on a sum type, $f\,\bot$ determines the top level structure of $\mathit{mfix}\ f$.
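
The list-monad observation above can be tried out in Haskell. This is a sketch under the assumption that GHC's `MonadFix` instance for lists is the operator in question; `twoWays` and `fixedPoint` are hypothetical names, and `undefined` stands in for $\bot$.

```haskell
import Control.Monad.Fix (mfix)

-- f :: a -> [a] at a = [Int]. Applied to bottom it still yields a
-- two-element list, because the spine does not depend on the argument.
twoWays :: [Int] -> [[Int]]
twoWays xs = [1 : xs, 2 : xs]

-- The value-recursive fixed point in the list monad.
fixedPoint :: [[Int]]
fixedPoint = mfix twoWays
```

Here `length (twoWays undefined)` and `length fixedPoint` are both 2, and the two elements of `fixedPoint` are the infinite lists of ones and of twos respectively.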

We can now establish the strictness property in one direction (see Remark 2.1.2):

Corollary 2.6.5 Let $f :: \alpha \to m\,\alpha$, and $\mathit{mfix}\ f = \bot$. Then $f$ is strict, provided mfix satisfies purity and left shrinking laws.

Proof By Proposition 2.6.3, $f\,\bot \sqsubseteq \bot$, implying that $f\,\bot = \bot$. □

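2.6.3 Pure right shrinking

The sliding property allows lifting of pure computations from the right hand side of a $\gg=$:

Corollary 2.6.6 Let $f :: \alpha \to m\,\alpha$ and $h :: \alpha \to \beta$. Then,

$$\mathit{mfix}\,(\lambda(x, y).\ f\,x \gg= \lambda z.\ \mathit{return}\,(z, h\,z)) \;=\; \mathit{mfix}\ f \gg= \lambda z.\ \mathit{return}\,(z, h\,z) \tag{2.13}$$

provided mfix satisfies sliding. (On the left hand side, the value-recursion loop is over $(\alpha, \beta)$, while on the right hand side it is over $\alpha$ only.)
Proof We have

$$\begin{aligned}
&\mathit{mfix}\,(\lambda(x, y).\ f\,x \gg= \lambda z.\ \mathit{return}\,(z, h\,z)) \\
&\quad= \mathit{mfix}\,(\mathit{map}\,(\lambda z.\ (z, h\,z)) \cdot f \cdot \pi_1) \\
&\quad= \mathit{map}\,(\lambda z.\ (z, h\,z))\,(\mathit{mfix}\,(f \cdot \pi_1 \cdot (\lambda z.\ (z, h\,z)))) && \{\text{slide}\} \\
&\quad= \mathit{mfix}\ f \gg= \lambda z.\ \mathit{return}\,(z, h\,z)
\end{aligned}$$

Sliding applies, since $(f \cdot \pi_1)\,\bot = (f \cdot \pi_1 \cdot \lambda z.\ (z, h\,z))\,\bot = f\,\bot$. □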
The diagram in this case looks like:

[Diagram omitted.]

which suggests the name pure right shrinking.

Warning 2.6.7 In case we have lifted products, as in Haskell, the pattern matches against pairs should be done lazily. That is, every formula of the form $\lambda(x, y).\ f\,x\,y$ should be replaced with $\lambda t.\ f\,(\pi_1\,t)\,(\pi_2\,t)$, or the Haskell equivalent $\lambda\,{\sim}(x, y).\ f\,x\,y$. (And similarly for triples, quadruples, etc.) For instance, Equation 2.13 should be expressed as:

$$\mathit{mfix}\,(\lambda t.\ f\,(\pi_1\,t) \gg= \lambda z.\ \mathit{return}\,(z, h\,z)) \;=\; \mathit{mfix}\ f \gg= \lambda z.\ \mathit{return}\,(z, h\,z)$$

or,

$$\mathit{mfix}\,(\lambda\,{\sim}(x, y).\ f\,x \gg= \lambda z.\ \mathit{return}\,(z, h\,z)) \;=\; \mathit{mfix}\ f \gg= \lambda z.\ \mathit{return}\,(z, h\,z)$$

avoiding the strict match against the tuple.
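
The lazy-pattern form of Equation 2.13 can be written directly in Haskell. The sketch below assumes GHC's `MonadFix` instance for `Maybe` as mfix; `grow`, `summarize`, `prsLhs` and `prsRhs` are hypothetical names chosen for this illustration.

```haskell
import Control.Monad.Fix (mfix)

-- Plays the role of f :: a -> m a.
grow :: [Int] -> Maybe [Int]
grow xs = Just (1 : xs)

-- Plays the role of the pure function h :: a -> b.
summarize :: [Int] -> Int
summarize = sum . take 3

-- Left hand side: the loop is over a pair, matched lazily with ~(x, _).
prsLhs :: Maybe ([Int], Int)
prsLhs = mfix (\ ~(x, _) -> grow x >>= \z -> return (z, summarize z))

-- Right hand side: the loop is over the first component only, and the
-- pure computation has been pulled out to the right.
prsRhs :: Maybe ([Int], Int)
prsRhs = mfix grow >>= \z -> return (z, summarize z)
```

The first components are infinite lists, so the two sides are best compared through the second component, e.g. with `fmap snd`. Replacing `~(x, _)` by the strict pattern `(x, _)` would force the recursive pair too early, which is exactly what the warning cautions against.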

It is possible to generalize Equation 2.13 so that $h$ can use $x$ and $y$ as well. We call this variant the scope change law:

Proposition 2.6.8 Let $f :: \alpha \to m\,\alpha$ and $h :: \alpha \to (\alpha, \beta) \to \beta$. Then,

$$\mathit{mfix}\,(\lambda(x, y).\ f\,x \gg= \lambda z.\ \mathit{return}\,(z, h\,z\,(x, y))) \;=\; \mathit{mfix}\ f \gg= \lambda z.\ \mathit{return}\,(\mathit{fix}\,(\lambda(x, y).\ (z, h\,z\,(x, y)))) \tag{2.14}$$

provided mfix satisfies purity, left shrinking, nesting, and sliding laws.
Proof See Appendix B.2. □

Remark 2.6.9 Simple manipulation of the right hand side of Equation 2.14 yields the following equation:

$$\mathit{mfix}\,(\lambda(x, y).\ f\,x \gg= \lambda z.\ \mathit{return}\,(z, h\,z\,(x, y))) \;=\; \mathit{mfix}\ f \gg= \lambda z.\ \mathit{return}\,(z, \mathit{fix}\,(\lambda y.\ h\,z\,(z, y))) \tag{2.15}$$

This form of the scope changing property is quite useful in derivations, although somewhat less symmetric than Equation 2.14.
2.6.4 Parametricity: The "free" theorem

The least fixed-point operator on domains satisfies the following uniformity law [60, 82]: Let $f :: \alpha \to \alpha$, $g :: \beta \to \beta$, and $s :: \alpha \to \beta$, where $s$ is strict. Then,

$$s \cdot f = g \cdot s \;\Longrightarrow\; s\,(\mathit{fix}\ f) = \mathit{fix}\ g \tag{2.16}$$

This extremely useful law is exactly the free theorem for the type $(\alpha \to \alpha) \to \alpha$, and hence granted by virtue of parametricity in our setting [75]. For mfix, parametricity gives us the following theorem for free:

Theorem 2.6.10 Let $f :: \alpha \to m\,\alpha$, $g :: \beta \to m\,\beta$, and $s :: \alpha \to \beta$. Then,

$$\mathit{map}\ s \cdot f = g \cdot s \;\Longrightarrow\; \mathit{map}\ s\,(\mathit{mfix}\ f) = \mathit{mfix}\ g \tag{2.17}$$

provided $s$ is strict. □

Remark 2.6.11 It is worth emphasizing that we use Theorem 2.6.10 freely in our treatment of value recursion.⁴ If one takes a more abstract view, we expect Equation 2.17 to be postulated as a property to be checked, rather than taken for granted. This raises the question of what exactly strict would mean in this new setting. See Simpson and Plotkin's recent work for a modern account of such questions [79]. (We will return to the treatment of value recursion in more abstract settings in Chapter 6.)

As we pointed out before, sliding strict computations is a direct consequence of parametricity:

Corollary 2.6.12 Let $f :: \alpha \to m\,\beta$ and $h :: \beta \to \alpha$. Then,

$$\mathit{mfix}\,(\mathit{map}\ h \cdot f) \;=\; \mathit{map}\ h\,(\mathit{mfix}\,(f \cdot h)) \tag{2.18}$$

provided $h$ is strict.

Proof Direct consequence of the free theorem with $F \mapsto f \cdot h$, $G \mapsto \mathit{map}\ h \cdot f$ and $S \mapsto h$, where we use capital letters to identify variables in Equation 2.17. □

4 A word of caution is in order regarding Haskell and parametricity. It is well known that the seq primitive weakens the parametricity properties of Haskell [50, 68, 88]. We do not make use of this primitive in our work.
Parametricity allows us to take mirror images of our properties. For instance, the following equation is essentially the same as Equation 2.13:

$$\mathit{mfix}\,(\lambda(x, y).\ f\,y \gg= \lambda z.\ \mathit{return}\,(h\,z, z)) \;=\; \mathit{mfix}\ f \gg= \lambda z.\ \mathit{return}\,(h\,z, z)$$

Obviously, we can consider the same equation over arbitrary length tuples and arbitrary permutations as well. We capture the essence of this process in the following corollary:

Corollary 2.6.13 Let $f, g :: (\alpha, \beta) \to m\,(\alpha, \beta)$. The equation $\mathit{mfix}\ f = \mathit{mfix}\ g$ holds exactly when its mirror image, that is:

$$\mathit{mfix}\,(\mathit{map}\ \mathit{swap} \cdot f \cdot \mathit{swap}) \;=\; \mathit{mfix}\,(\mathit{map}\ \mathit{swap} \cdot g \cdot \mathit{swap})$$

holds, where $\mathit{swap}\,(x, y) = (y, x)$.

Proof Simple application of Corollary 2.6.12 on both sides. Note that swap is strict. □

As a final corollary to the free theorem, we consider the following injection law:

Corollary 2.6.14 Let $f :: \alpha \to m\,\alpha$, $i :: \alpha \to \beta$, and $p :: \beta \to \alpha$, where $p$ is strict and $p \cdot i = \mathit{id}_\alpha$. We have:

$$\mathit{mfix}\ f \;=\; \mathit{map}\ p\,(\mathit{mfix}\,(\mathit{map}\ i \cdot f \cdot p)) \tag{2.19}$$

Proof Let $F \mapsto \mathit{map}\ i \cdot f \cdot p$, $G \mapsto f$, and $S \mapsto p$ in the free theorem. Again, capital letters denote the variables in Equation 2.17. □

Note that Corollary 2.6.14 also follows from the sliding property. The intended reading of Equation 2.19 is as follows. The function $i$ injects $\alpha$'s to $\beta$'s, while $p$ projects back. Hence, we can introduce spurious variables into the recursive loop, as long as they are not used anywhere.

2.7 Stronger properties 

In this section we present two laws, strong sliding and right shrinking, which might be naively expected to be satisfied by value recursion operators. As we will prove in Chapter 3, however, they are both unsatisfiable for a wide variety of monads of practical interest. The most important monad satisfying both these properties is the lazy state monad (Section 4.4).
2.7.1 Strong sliding

If Equation 2.5 holds unconditionally (i.e., without requiring $f\,(h\,\bot) = f\,\bot$), we say that the given value recursion operator satisfies the strong sliding property. As we will see in Chapter 3, strong sliding is not satisfiable for a variety of practical monads. However, when available, it allows us to deduce several interesting equalities:

Proposition 2.7.1 Let $f :: \alpha \to m\,\alpha$ and $q :: \alpha$. Then,

$$\mathit{mfix}\,(\lambda(x, \_).\ f\,x \gg= \lambda y.\ \mathit{return}\,(q, y)) \;=\; f\,q \gg= \lambda y.\ \mathit{return}\,(q, y) \tag{2.20}$$

provided mfix satisfies the purity, left shrinking and strong sliding properties.
Proof See Appendix B.3. □

Proposition 2.7.2 Let $f :: \alpha \to m\,\beta$ and $g :: \beta \to m\,\alpha$. Then,

$$\begin{aligned}
&\mathit{mfix}\,(\lambda(x, y).\ f\,x \gg= \lambda y'.\ g\,y \gg= \lambda x'.\ \mathit{return}\,(x', y')) \\
&\quad= \mathit{mfix}\,(\lambda(x, \_).\ f\,x \gg= \lambda y'.\ g\,y' \gg= \lambda x'.\ \mathit{return}\,(x', y'))
\end{aligned} \tag{2.21}$$

provided mfix satisfies the purity, left shrinking, nesting and strong sliding properties. Diagrammatically:

[Diagram omitted.]

Proof Straightforward applications of nesting, left shrinking, and the mirror image of the previous proposition on the left hand side. □



2.7.2 Right shrinking 

Pure right shrinking (Corollary 2.6.6) tells us how to pull pure computations from the right hand side of a $\gg=$. Although it is not possible to pull out effectful computations in general, there are certain monads for which it is possible to do so, the most important examples being the output monad (or, in general, monads based on monoids; see Section 4.5), and the lazy state monad (Section 4.4). The following property captures the situation:

Property 2.7.3 (Right shrinking.) Let $f :: \alpha \to m\,\alpha$ and $g :: \alpha \to m\,\beta$. Then,

$$\begin{aligned}
&\mathit{mfix}\,(\lambda(x, y).\ f\,x \gg= \lambda z.\ g\,z \gg= \lambda w.\ \mathit{return}\,(z, w)) \\
&\quad= \mathit{mfix}\ f \gg= \lambda z.\ g\,z \gg= \lambda w.\ \mathit{return}\,(z, w)
\end{aligned} \tag{2.22}$$

Diagrammatically:

[Diagram omitted.]

Fact 2.7.4 Obviously, Equation 2.22 generalizes Equation 2.13. That is, if a given value recursion operator satisfies right shrinking, it will automatically satisfy the pure version as well.

The combination of right shrinking and strong sliding allows us to generalize the scope change law (Proposition 2.6.8) as well:

Proposition 2.7.5 Let $f :: \alpha \to m\,\alpha$ and $g :: \alpha \to (\alpha, \beta) \to m\,\beta$. Then,

$$\begin{aligned}
&\mathit{mfix}\,(\lambda(x, y).\ f\,x \gg= \lambda z.\ g\,z\,(x, y) \gg= \lambda w.\ \mathit{return}\,(z, w)) \\
&\quad= \mathit{mfix}\ f \gg= \lambda z.\ \mathit{mfix}\,(\lambda b.\ g\,z\,(z, b)) \gg= \lambda w.\ \mathit{return}\,(z, w)
\end{aligned}$$

provided mfix satisfies purity, left shrinking, nesting, strong sliding and right shrinking.
Proof Analogous to the proof of Proposition 2.6.8. □



2.8 Classification and summary 

Our properties try to capture the expected behavior of value recursion operators, formal- 
izing our intuitions. It is worth reiterating the most important goals: 

• Recursion should be performed only over the values, and the fixed-point computation 
should be similar to that of fix, 

• Effects should be neither repeated nor lost, 

• In the case when there are no recursively bound variables, mdo should behave 
exactly like a do. 

How do our properties match these goals? Strictness states that the fixed-point is $\bot$
exactly when the given function is strict, analogous to fix. Purity states that, in case there 
are no effects, mfix should behave exactly like fix. These two properties are as close as we 
get to the behavior of the usual fixed-point operator on domains. Left shrinking states 
that mdo is exactly the same as do, in case there are no recursive bindings. We consider 
these three properties to be the most essential, leading to the following definition: 
Definition 2.8.1 (Value recursion operators.) A value recursion operator for a monad $(m, \gg=, \mathit{return})$ is a function $\mathit{mfix} :: (\alpha \to m\,\alpha) \to m\,\alpha$, satisfying:

• Strictness: $f\,\bot_\alpha = \bot_{m\,\alpha} \iff \mathit{mfix}\ f = \bot_{m\,\alpha}$,

• Purity: $\mathit{mfix}\,(\mathit{return} \cdot h) = \mathit{return}\,(\mathit{fix}\ h)$,

• Left shrinking: $\mathit{mfix}\,(\lambda x.\ a \gg= \lambda y.\ f\,x\,y) = a \gg= \lambda y.\ \mathit{mfix}\,(\lambda x.\ f\,x\,y)$, provided $x$ is not free in $a$.
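
The purity requirement is easy to exercise in Haskell. The sketch below assumes GHC's `MonadFix` instance for `Maybe` as the candidate operator and `Data.Function.fix` as the ordinary fixed-point operator; `consOne`, `viaMfix` and `viaFix` are hypothetical names.

```haskell
import Control.Monad.Fix (mfix)
import Data.Function (fix)

-- A pure function whose least fixed point is the infinite list of ones.
consOne :: [Int] -> [Int]
consOne = (1 :)

-- Left hand side of purity: value recursion over an effect-free computation.
viaMfix :: Maybe [Int]
viaMfix = mfix (return . consOne)

-- Right hand side of purity: ordinary recursion, then injection into the monad.
viaFix :: Maybe [Int]
viaFix = return (fix consOne)
```

Since both results contain an infinite list, agreement is observed on finite prefixes, e.g. `fmap (take 3) viaMfix` against `fmap (take 3) viaFix`.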

At this point, two questions arise. First, why are the sliding and nesting properties left out of Definition 2.8.1, even though we have found that they are both satisfied by many instances of mfix in practice (see Chapter 4)? And second, are there other properties of interest that we have completely missed?

The answer to the first question is a matter of choice. We would like to keep the requirements as simple as possible, but no simpler. As several examples in Chapter 4 will show, operators that do not satisfy the basic properties mandated by Definition 2.8.1 yield results that are not very sensible for value recursion. Other properties are just as important theoretically, but it is our belief that they are of secondary importance from a practical point of view.

It is much harder to answer the second question. Whether we have the "right" definition should become apparent as value recursion finds its place in practical functional programming. Our work, both in the context of this thesis and in using recursive monadic bindings in practical Haskell programs, led us to conclude that Definition 2.8.1 satisfactorily captures the minimal common core.

Finally, a comment on uniqueness is in order. Given a particular monad, we do not require a unique value recursion operator for it. There may be none, exactly one, or many operators satisfying the requirements of Definition 2.8.1. (For instance, in Chapter 4 we will be able to show that the identity, maybe and list monads of Haskell have unique value recursion operators, while the state monad has an infinite chain of them. At the other extreme, the continuation monad probably has none; see Chapter 5 for details.) Furthermore, different operators for the same monad might satisfy different sets of properties in addition to the basic set mandated by Definition 2.8.1. In such a case, the user has the responsibility to pick the most appropriate operator for the problem at hand, possibly using our properties as a guide. We will see a concrete example of this situation in Section 4.4.



Chapter 3 

Structure of monads and value recursion 



So far, our study of value recursion was set in the context of arbitrary monads. We will 
now take a closer look at various properties that monads may satisfy, such as idempotency, 
commutativity, or additivity. The aim of this chapter is to investigate the implications of 
structural properties of monads for value recursion. 

Synopsis. We first consider monads whose $\gg=$ operator is strict in its first argument, covering many practical monads of interest. We show that strong sliding and right
shrinking properties are not satisfiable for such monads. We then consider idempotent, 
commutative and additive monads, trying to identify how value recursion operators should 
behave in each case. Finally, we briefly discuss embeddings and monad transformers. 

3.1 Monads with a strict bind operator 

Consider a monad $m$ whose $\gg=$ operator is strict in its first argument. That is:

$$\bot_{m\,\tau} \gg= f \;=\; \bot_{m\,\alpha} \tag{3.1}$$

for all $f :: \tau \to m\,\alpha$. Haskell's maybe, list, IO, and strict state monads are examples of such monads. In this section, we will prove that neither the strong sliding nor the right shrinking property can be satisfied for such a monad, unless it is trivial in the following sense:

Definition 3.1.1 (Trivial monad.) A monad $(m, \gg=, \mathit{return})$ is trivial if, for all types $\tau$, the domain corresponding to the type $m\,\tau$ consists only of $\bot_{m\,\tau}$.

Remark 3.1.2 The canonical example of a trivial monad is:

    data Void a      -- no constructors, all we have is ⊥

    return x = ⊥
    m >>= f  = ⊥
Note that all of our properties hold for a trivial monad, with the only possible definition $\mathit{mfix}\ f = \bot$.

Lemma 3.1.3 Let $(m, \gg=, \mathit{return})$ be a monad where $\gg=$ is strict in its first argument. If return is strict as well, then $m$ is trivial.¹

Proof Pick an arbitrary type $\tau$, and let $a$ be an arbitrary element of $m\,\tau$. We have:

$$\begin{aligned}
a &= \mathit{const}\ a\ \bot_\tau && \{\mathit{const}\ x\ y = x\} \\
&= \mathit{return}_\tau\ \bot_\tau \gg= \mathit{const}\ a && \{\text{left unit}\} \\
&= \bot_{m\,\tau} \gg= \mathit{const}\ a && \{\mathit{return}\ \text{is strict}\} \\
&= \bot_{m\,\tau} && \{\gg=\ \text{is strict}\}
\end{aligned}$$

The result now follows by Definition 3.1.1. □

Note that Lemma 3.1.3 requires return to be strict at all types. The following lemma simplifies this requirement, reducing the proof obligation to return being strict at only one particular type:²

Lemma 3.1.4 Let $(m, \gg=, \mathit{return})$ be a monad where $\gg=$ is strict in its first argument. If return is strict at one type (i.e., there exists a type $\tau$ s.t. $\mathit{return}_\tau\ \bot_\tau = \bot_{m\,\tau}$), then it is strict at all types.

Proof See Appendix B.4. □
After these preliminary results, we can now proceed with our original goal:

Proposition 3.1.5 Let $(m, \gg=, \mathit{return})$ be a monad where $\gg=$ is strict in its first argument. If there is a value recursion operator for $m$ that satisfies the strong sliding property of Section 2.7.1, then $m$ is trivial.

Proof We will first establish that if such an operator exists, then return must be strict. Define:³

    f :: () → m ()          h :: () → ()
    f () = return ()        h _ = ()

Note that $f \cdot h = \lambda x.\ \mathit{return}\ ()$. Let mfix be a value recursion operator for $m$ satisfying the strong sliding property. Then, Equation 2.6 must hold with no side conditions. The right hand side of Equation 2.6 reads:

$$\mathit{mfix}\,(f \cdot h) \gg= \mathit{return} \cdot h$$

and, by Proposition 2.6.1 and the left unit law, it must be equal to $\mathit{return}\ ()$. Similarly, the left hand side of Equation 2.6 reads:

$$\mathit{mfix}\,(\lambda x.\ f\,x \gg= \mathit{return} \cdot h)$$

and, by the strictness property, it must compute to $\bot$. (Note that $f$ is strict because it matches its argument against (), and $\gg=$ is strict in its first argument by hypothesis.) Hence, strong sliding implies $\mathit{return}\ () = \bot$. By monotonicity, then, return must be strict at the type (). Hence, by Lemmas 3.1.4 and 3.1.3, $m$ must be trivial. □

1 For brevity, we simply refer to a monad $(m, \gg=, \mathit{return})$ by the name of its type constructor, i.e., $m$.

2 This lemma and its proof were suggested to us by Ross Paterson (personal communication).

3 The domain corresponding to the unit type, written () following the Haskell notation, consists of exactly two elements: $\bot$ and (), with the obvious ordering.

A similar argument shows that the right shrinking property shares the same fate:

Proposition 3.1.6 Let $(m, \gg=, \mathit{return})$ be a monad where $\gg=$ is strict in its first argument. If there is a value recursion operator for $m$ that satisfies the right shrinking property of Section 2.7.2, then $m$ is trivial.
Proof Define:

    f :: [Int] → m [Int]        g :: [Int] → m Int
    f xs = return (1 : xs)      g [x] = return x
                                g _   = return 1

It is easy to see that the left hand side of Equation 2.22 must yield $\bot$ by the strictness property (note that $g$ will diverge on $1 : \bot$). By purity, we have

$$\mathit{mfix}\ f \;=\; \mathit{return}\,(\mathit{fix}\,(\lambda xs.\ 1 : xs))$$

Hence, the right hand side of Equation 2.22 evaluates to

$$\mathit{return}\,(\mathit{fix}\,(\lambda xs.\ 1 : xs),\ 1)$$

implying that $\bot = \mathit{return}\,(\mathit{fix}\,(\lambda xs.\ 1 : xs),\ 1)$. By monotonicity, then, return must be strict at the type ([Int], Int). Hence, by Lemmas 3.1.4 and 3.1.3, $m$ must be trivial. □

In other words, unless a given monad $m$ is trivial, no value recursion operator for $m$ can satisfy the strong sliding or right shrinking properties, provided $m$'s $\gg=$ operator is strict in its first argument. This is an important result, as it identifies inherent limitations on properties that can be expected to hold for many practical monads of interest.

Corollary 3.1.7 Neither the strong sliding nor the right shrinking property is satisfiable for Haskell's maybe, list, strict state and IO monads, as none of these monads are trivial (no pun intended; see Definition 3.1.1). □
3.2 Idempotent monads

A monad $m$ is said to be idempotent if the equation⁴

$$a \gg= \lambda x.\ a \gg= \lambda y.\ \mathit{return}\,(x, y) \;=\; a \gg= \lambda x.\ \mathit{return}\,(x, x) \tag{3.2}$$

holds for all $a :: m\,\tau$ [46]. Identity, maybe and environment monads are examples of idempotent monads. Intuitively, a monad is idempotent if computations can be duplicated whenever their results are needed.
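
Equation 3.2 can be phrased as a pair of Haskell functions and evaluated at particular monads. This is only an illustrative sketch; the names `runTwice` and `runOnce` are hypothetical.

```haskell
-- Left hand side of Equation 3.2: the computation is performed twice.
runTwice :: Monad m => m a -> m (a, a)
runTwice a = a >>= \x -> a >>= \y -> return (x, y)

-- Right hand side of Equation 3.2: the computation is performed once
-- and its result is duplicated.
runOnce :: Monad m => m a -> m (a, a)
runOnce a = a >>= \x -> return (x, x)
```

For the maybe monad the two agree, e.g. on `Just 3` and on `Nothing`. For the list monad they differ: at `[1, 2]` the first yields four pairs and the second only two, so the list monad is not idempotent in this sense.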

Note that Equation 3.2 does not specify any data flow between repeated computations. That is, the equation

$$\lambda x.\ f\,x \gg= f \;=\; f \tag{3.3}$$

is not required to hold.⁵ However, if a monad is idempotent, we expect both sides of Equation 3.3 to be indistinguishable by mfix. Furthermore, once $\mathit{mfix}\ f$ is computed for a function $f$, further applications of $f$ should not change the result. We capture these intuitions in the following property:

Property 3.2.1 (Idempotency.) Let $f :: \alpha \to m\,\alpha$, where $m$ is an idempotent monad with a value recursion operator mfix. Then,

$$\mathit{mfix}\,(\lambda x.\ f\,x \gg= f) \;=\; \mathit{mfix}\ f \tag{3.4}$$

$$\mathit{mfix}\ f \gg= f \;=\; \mathit{mfix}\ f \tag{3.5}$$


The first equality can be captured diagrammatically as follows:

[Diagram omitted.]

We leave it to the reader to picture Equation 3.5.



Remark 3.2.2 It is important to note that Property 3.2.1 does not state that Equation 3.4 or 3.5 can be used as definitions of value recursion operators whenever the underlying monad is idempotent.⁶ For instance, Equation 3.5 will always produce $\bot$ for mfix in a monad with a $\gg=$ operator that is strict in its first argument, which is clearly undesirable.

We will discuss the idempotency property with respect to identity, exception, monads based on idempotent monoids, and environments in Chapter 4.

4 In category theory, a monad $m$ is called idempotent if its $\mathit{join} :: m\,(m\,a) \to m\,a$ operator is an isomorphism [55]. The definition we use is more useful from a practical point of view, however. For instance, the maybe monad is idempotent with our definition, although its join operator is not an isomorphism.

5 As a counterexample, consider the identity monad, where Equation 3.3 is satisfied only for idempotent functions (i.e., $f^2 = f$), but not in general.

6 Individual definitions might coincide, of course. For instance, in Chapter 4 we will see that Equation 3.5 does indeed define value recursion operators for identity and environments, but not for exceptions.



3.3 Commutative monads 



A monad $m$ is said to be commutative if the order of effects does not matter. That is, if the equation

$$\begin{aligned}
&\lambda(x, y).\ f\,x \gg= \lambda x'.\ g\,y \gg= \lambda y'.\ \mathit{return}\,(x', y') \\
&\quad= \lambda(x, y).\ g\,y \gg= \lambda y'.\ f\,x \gg= \lambda x'.\ \mathit{return}\,(x', y')
\end{aligned} \tag{3.6}$$

holds for all $f :: \alpha \to m\,\beta$ and $g :: \tau \to m\,\sigma$. For a commutative monad, we expect mfix to satisfy swapping of computations similarly, as depicted in the following diagram:

[Diagram omitted.]

Property 3.3.1 (Commutativity.) Let $f :: \alpha \to m\,\beta$ and $g :: \beta \to m\,\alpha$, where $m$ is a commutative monad with a value recursion operator mfix. Then,

$$\begin{aligned}
&\mathit{mfix}\,(\lambda(x, \_).\ f\,x \gg= \lambda y'.\ g\,y' \gg= \lambda x'.\ \mathit{return}\,(x', y')) \\
&\quad= \mathit{mfix}\,(\lambda(\_, y).\ g\,y \gg= \lambda x'.\ f\,x' \gg= \lambda y'.\ \mathit{return}\,(x', y'))
\end{aligned} \tag{3.7}$$

In case a value recursion operator satisfies nesting and strong sliding laws, Equation 3.7 can be derived automatically:

Proposition 3.3.2 Equation 3.7 follows from nesting and strong sliding laws.
Proof Straightforward applications of Equation 2.21, Equation 3.6, nesting, left shrinking, and Equation 2.20 on the left hand side. □

Examples of commutative monads include identity, environments, and monads based on commutative monoids. We will investigate the commutativity property with respect to these monads in Chapter 4.
3.4 Monads with addition

A monad $m$ is said to be additive if there exists an element $\mathit{zero} :: m\,\tau$, and an operation $\oplus :: m\,\tau \to m\,\tau \to m\,\tau$, such that:

$$\begin{aligned}
\mathit{zero} \oplus p &= p & \mathit{zero} \gg= f &= \mathit{zero} \\
p \oplus \mathit{zero} &= p & p \gg \mathit{zero} &= \mathit{zero} \\
(p \oplus q) \oplus r &= p \oplus (q \oplus r)
\end{aligned}$$

The relation between $\oplus$ and $\gg=$ is not specified, although one generally checks for the following distributive laws:

$$(p \oplus q) \gg= f \;=\; (p \gg= f) \oplus (q \gg= f) \tag{3.8}$$

$$p \gg= (\lambda x.\ q\,x \oplus r\,x) \;=\; (p \gg= q) \oplus (p \gg= r) \tag{3.9}$$

In Haskell, additive monads are captured as instances of the MonadPlus class, where zero is called mzero and $\oplus$ is called mplus [68]. The maybe and list monads are instances of this class.⁷ It is interesting to note that the list monad satisfies Equation 3.8, but not 3.9, while the maybe monad satisfies Equation 3.9, but not 3.8.
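
The two failures mentioned above can be witnessed with small Haskell counterexamples. The sketch uses `mplus` from `Control.Monad`; the helper names `eq38`, `eq39`, `listEq39` and `maybeEq38`, as well as the particular arguments, are hypothetical choices made for this illustration.

```haskell
import Control.Monad (MonadPlus, mplus)

-- Both sides of Equation 3.8, returned as a pair.
eq38 :: MonadPlus m => m a -> m a -> (a -> m b) -> (m b, m b)
eq38 p q f = (mplus p q >>= f, mplus (p >>= f) (q >>= f))

-- Both sides of Equation 3.9, returned as a pair.
eq39 :: MonadPlus m => m a -> (a -> m b) -> (a -> m b) -> (m b, m b)
eq39 p q r = (p >>= \x -> mplus (q x) (r x), mplus (p >>= q) (p >>= r))

-- List monad: the two sides of Equation 3.9 contain the same elements
-- in a different order.
listEq39 :: ([Int], [Int])
listEq39 = eq39 [1, 2] (\x -> [x]) (\x -> [10 * x])

-- Maybe monad: the left side of Equation 3.8 commits to the first
-- alternative, which then fails; the right side recovers with the second.
maybeEq38 :: (Maybe Int, Maybe Int)
maybeEq38 = eq38 (Just 1) (Just 2) (\x -> if x == 1 then Nothing else Just x)
```

Evaluating `listEq39` gives `([1,10,2,20],[1,2,10,20])`, and `maybeEq38` gives `(Nothing,Just 2)`.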

For an additive monad, we expect the following property to hold:

Property 3.4.1 (Distributivity.) Let $m$ be an additive monad with $\oplus$ as the binary operator. Let mfix be a value recursion operator for $m$. Distributivity states:

$$\mathit{mfix}\,(\lambda x.\ f\,x \oplus g\,x) \;=\; \mathit{mfix}\ f \oplus \mathit{mfix}\ g \tag{3.10}$$

If Equation 3.8 holds, left shrinking is sufficient to establish the distributivity property:

Proposition 3.4.2 Let $m$ be an additive monad with $\oplus$ as the binary operator, and let mfix be a value recursion operator for $m$. If $\oplus$ satisfies Equation 3.8, then mfix will satisfy distributivity.

Proof See Appendix B.5. □

Remark 3.4.3 It is worth noting that Equation 3.8 is a sufficient, but not a necessary condition for satisfying distributivity. As we will see in Chapter 4, the maybe monad does not satisfy Equation 3.8, yet it has a value recursion operator satisfying distributivity.

7 In fact, the law $p \gg \mathit{zero} = \mathit{zero}$ fails for both the maybe and list monads when $p = \bot$. This discrepancy does not cause any trouble for our purposes. (Recall: $m \gg k = m \gg= \lambda\_.\ k$.)
3.5 Embeddings

Consider Haskell's maybe and list monads. Intuitively, every value of type Maybe $\tau$ can be considered as a value of type $[\tau]$, mapping Nothing to [ ] and Just x to [x]. In a certain sense, the list monad is rich enough to capture the features of the maybe monad. Formally, this relation is captured by monad homomorphisms and embeddings [53, 89]:

Definition 3.5.1 (Monad homomorphisms and embeddings.) Let $m$ and $n$ be two monads. A monad homomorphism, $\epsilon :: m \to n$, is a family of functions, one for each type $\tau$, $\epsilon_\tau :: m\,\tau \to n\,\tau$, such that:

$$\epsilon \cdot \mathit{return}_m \;=\; \mathit{return}_n \tag{3.11}$$

$$\epsilon_\sigma\,(k \gg=_m f) \;=\; \epsilon_\tau\,k \gg=_n \epsilon_\sigma \cdot f \tag{3.12}$$

where $k :: m\,\tau$ and $f :: \tau \to m\,\sigma$. An embedding is a monad homomorphism where each $\epsilon_\tau$ is monic (i.e., injective).

Equations 3.11 and 3.12 precisely describe how $\epsilon$ interacts with the proper morphisms of the involved monads. For value recursion, we also need to specify how $\epsilon$ and mfix interact:

Definition 3.5.2 (Monad homomorphisms and embeddings for value recursion.) Let $m$ and $n$ be two monads with respective value recursion operators $\mathit{mfix}_m$ and $\mathit{mfix}_n$. Let $\epsilon :: m \to n$ be a monad homomorphism or embedding. We say that $\epsilon$ respects value recursion if, for all $f :: \tau \to m\,\tau$,

$$\epsilon\,(\mathit{mfix}_m\ f) \;=\; \mathit{mfix}_n\,(\epsilon \cdot f) \tag{3.13}$$

In Chapter 4, we will see several concrete examples, including the embeddings of maybe into list, environment and output into state, and identity into any other monad.
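
Equation 3.13 can be checked on one instance for the embedding of maybe into list. The sketch assumes that the standard function `maybeToList` from `Data.Maybe` plays the role of the embedding and that GHC's `MonadFix` instances for `Maybe` and lists are the two value recursion operators; `stepM`, `fixThenEmbed` and `embedThenFix` are hypothetical names.

```haskell
import Control.Monad.Fix (mfix)
import Data.Maybe (maybeToList)

-- A function f :: t -> m t in the maybe monad.
stepM :: [Int] -> Maybe [Int]
stepM xs = Just (1 : xs)

-- Left hand side of Equation 3.13: recurse in maybe, then embed into list.
fixThenEmbed :: [[Int]]
fixThenEmbed = maybeToList (mfix stepM)

-- Right hand side of Equation 3.13: embed first, then recurse in list.
embedThenFix :: [[Int]]
embedThenFix = mfix (maybeToList . stepM)
```

Both expressions denote a singleton list holding the infinite list of ones, which can be observed with `map (take 3)`.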

Proposition 3.5.3 Let $\epsilon :: m \to n$ be an embedding of a monad $m$ into a monad $n$. Let $\mathit{mfix}_n$ be a value recursion operator for $n$. Let $g :: (\tau \to m\,\tau) \to m\,\tau$ be a function satisfying the strictness property. If $\epsilon$ satisfies Equation 3.13, where $g$ plays the role of $\mathit{mfix}_m$, then $g$ is a value recursion operator for $m$, i.e., it will satisfy purity and left shrinking properties as well.

Proof Simple equational reasoning. We present the left shrinking case to illustrate the idea:

$$\begin{aligned}
&\epsilon\,(g\,(\lambda x.\ a \gg= \lambda y.\ f\,x\,y)) \\
&\quad= \mathit{mfix}_n\,(\lambda x.\ \epsilon\,(a \gg= \lambda y.\ f\,x\,y)) && \{\text{Eqn. 3.13}\} \\
&\quad= \mathit{mfix}_n\,(\lambda x.\ \epsilon\,a \gg= \lambda y.\ \epsilon\,(f\,x\,y)) && \{\text{Eqn. 3.12}\} \\
&\quad= \epsilon\,a \gg= \lambda y.\ \mathit{mfix}_n\,(\lambda x.\ \epsilon\,(f\,x\,y)) && \{\text{left shrink}\} \\
&\quad= \epsilon\,a \gg= \lambda y.\ \epsilon\,(g\,(\lambda x.\ f\,x\,y)) && \{\text{Eqn. 3.13}\} \\
&\quad= \epsilon\,(a \gg= \lambda y.\ g\,(\lambda x.\ f\,x\,y)) && \{\text{Eqn. 3.12}\}
\end{aligned}$$

Since $\epsilon$ is injective, we obtain:

$$g\,(\lambda x.\ a \gg= \lambda y.\ f\,x\,y) \;=\; a \gg= \lambda y.\ g\,(\lambda x.\ f\,x\,y)$$

showing that $g$ satisfies left shrinking. □

Remark 3.5.4 It is unfortunate that strictness is not necessarily reflected. Using the
proof technique above, one gets e (g f) = mfix_n (e ∘ f), but we cannot conclude that g
satisfies strictness unless e is strict. It turns out that requiring e to be strict is overkill;
many embedding examples we will see in Chapter 4 are not strict.

Proposition 3.5.5 The sliding, nesting, strong sliding and right shrinking properties are
reflected through embeddings as well. That is, if e :: m → n is an embedding respecting
value recursion, and if mfix_n satisfies any of these properties, then so will mfix_m.

Proof Similar to the previous proposition. □

Observation 3.5.6 Composition of two embeddings is still an embedding, hence
properties are reflected through multiple embeddings as well.

Is it possible to derive value recursion operators using embeddings? Intuitively, if a
monad m embeds into another monad n, and if n has a value recursion operator, one
might expect to be able to derive a value recursion operator for m. In this case, we will
need the embedding to be a split monic, i.e., to possess a left inverse, in order to be able
to map results back to m. For instance, the embedding of the maybe monad into the list
monad, and its left inverse, are given by:

    e Nothing  = []              e^l []     = Nothing
    e (Just x) = [x]             e^l (x:xs) = Just x

More formally, let e :: m → n be an embedding with the left inverse e^l :: n → m, i.e.,
e^l_τ ∘ e_τ = id_{m τ}. Note that, in general, e^l is not a monad homomorphism.§ Let mfix_n
be a value recursion operator for n. When is the function:

    g :: (a → m a) → m a
    g f = e^l (mfix_n (e ∘ f))                           (3.14)

a value recursion operator for m? Since e^l is not a monad homomorphism, not all required
properties will follow automatically. Still, this construction gives a way of obtaining a
candidate value recursion operator, and we can test whether e respects value recursion
with respect to it. In this case, we need to verify:

    (e ∘ e^l) (mfix_n (e ∘ f)) = mfix_n (e ∘ f)           (3.15)

for all f :: a → m a. If Equation 3.15 holds, Propositions 3.5.3 and 3.5.5 will be sufficient
to establish properties for g automatically.

§ Furthermore, e and e^l are not required to form a retraction pair, i.e., e ∘ e^l ⊑ id [77].
In fact, e ∘ e^l is generally incomparable to id, as demonstrated by the embedding of maybe
into list.
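The construction of Equation 3.14 can be tried out directly for the embedding of maybe into list. The following is a minimal Haskell sketch, not a definitive implementation: the names embed, unembed, mfixList and mfixMaybe are ours, and mfixList is assumed to be a value recursion operator for lists (it is the one presented in Section 4.3).

```haskell
import Data.Function (fix)

-- Assumed value recursion operator for the list monad (see Section 4.3).
mfixList :: (a -> [a]) -> [a]
mfixList f = case fix (f . head) of
  []      -> []
  (x : _) -> x : mfixList (tail . f)

-- The embedding e of maybe into list.
embed :: Maybe a -> [a]
embed Nothing  = []
embed (Just x) = [x]

-- Its left inverse e^l: unembed . embed is the identity,
-- but embed . unembed is not (it drops all but the first element).
unembed :: [a] -> Maybe a
unembed []      = Nothing
unembed (x : _) = Just x

-- Candidate operator g of Equation 3.14: g f = e^l (mfix_n (e . f)).
mfixMaybe :: (a -> Maybe a) -> Maybe a
mfixMaybe f = unembed (mfixList (embed . f))
```

With these definitions, mfixMaybe (\xs -> Just (1 : xs)) is Just applied to the infinite list of 1's, mfixMaybe (const Nothing) is Nothing, and embed (unembed [1, 2]) is [1] rather than [1, 2].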

Remark 3.5.7 It is easy to see that e^l will always satisfy Equation 3.11. In general,
Equation 3.12 will only be satisfied on the subset of values that are in the image of e. The
maybe into list embedding given above illustrates this point. However, we suspect that the
subset of values on which Equation 3.12 is satisfied might be sufficient to establish further
properties of the derived operator. We leave the exploration of this idea for future work.

3.6 Monad transformers 

Closely related to monad homomorphisms is the idea of monad transformers. It is often
the case that one wants to add new features to an already existing monad. For instance,
one can add exceptions, state or non-determinism to a monad, obtaining a monad with
new computational features. Monad transformers have been designed to solve this problem
in a modular manner. Intuitively, given a monad m, a monad transformer t yields a new
monad t m, transforming return_m to return_{t m} and >>=_m to >>=_{t m}. Furthermore, one
requires a monad homomorphism lift_τ :: m τ → t m τ, lifting computations in m to the
new monad. We refrain from going into details here; the reader is referred to the rich
literature on monad transformers [22, 42, 53, 54].

For value recursion, we ask a similar question. Given a monad transformer t, is there
a natural way of obtaining mfix_{t m} from mfix_m? A generic approach would be to convert a
given function f :: a → t m a to a function of type a → m a, apply mfix_m to get the fixed-
point m a, and transfer it back to t m a using lift. Unfortunately, to do the conversion
from a → t m a to a → m a, one would need a morphism with type t m a → m a, the
inverse of lift, which is clearly not available in general.

On the other hand, it is generally possible to lift arbitrary value recursion operators,
provided we know the exact structure of the monad transformer. We will consider three
examples of monad transformers in Section 4.9, namely errors, environments, and state,
and show how we can lift the value recursion operators through these transformers. (This
technique does not always work, however, as illustrated by the continuation monad
transformer. See Section 5.2 for details.)

3.7 Summary 

In this chapter, we have concentrated on properties of value recursion operators that follow
from the structural properties of underlying monads. As we have seen, if the >>= operator
is strict in its first argument, then the strong sliding and right shrinking properties cannot
be satisfied. This is an important point: there are inherent limitations on what we can
expect from recursion in the presence of effects. (We will return to this issue in Chapter 6.)

The latter part of this chapter dealt with how value recursion operators reflect
properties such as idempotency, commutativity, and additivity, and how individual properties
are reflected through monad embeddings. In Chapter 4, we will get a chance to review
these properties with respect to concrete examples of value recursion operators.



Chapter 4 
A catalog of value recursion operators 



In this chapter, we present value recursion operators for monads that are frequently used
in functional programming, providing a catalog of mfix's for the working programmer.
Although there is no magic recipe, we believe that these examples present enough patterns
to guide the construction of value recursion operators for new monads.

Synopsis. We establish a framework with the identity monad and then cover
exceptions, lists, state, output, environments, trees, and fudgets. The continuation monad
proves to be problematic; we consider it separately in Chapter 5. We also discuss monad
transformers, enabling us to create new mfix's from old.

4.1 Identity 

The identity monad is the monad of pure values, modeling computations with no effects: 

    type Identity a = a

    return  = id
    x >>= f = f x

with fix as the corresponding value recursion operator, i.e.:

    mfix :: (a → Identity a) → Identity a                (4.1)
    mfix f = fix f



Proposition 4.1.1 Equation 4.1 defines the unique value recursion operator for the
identity monad.

Proof It is easy to show that fix satisfies strictness, purity, and left shrinking properties.
For uniqueness, we will show that any value recursion operator for the identity monad must
equal fix. Let mfix' be such an operator. We have:

    mfix' f = mfix' (return ∘ f) = return (fix f) = fix f

by using purity and the fact that return = id. □

Remark 4.1.2 Although we will stick to Haskell notation, we will generally avoid using
explicit tags to reduce clutter as long as we can. For instance, for overloading purposes,
the proper way to define the identity monad and mfix in Haskell is:¹

    newtype Identity a = Id { unId :: a }

    instance Monad Identity where
        return x   = Id x
        Id x >>= f = f x

    mfix f = fix (f . unId)
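For readers who want to load this definition into a current GHC, the following is a self-contained sketch under two assumptions of ours: Functor and Applicative instances are added (they have been superclass requirements of Monad since GHC 7.10), and the operator is named mfixId to avoid clashing with the mfix exported by Control.Monad.Fix.

```haskell
import Data.Function (fix)

newtype Identity a = Id { unId :: a }

-- Superclass instances required by Monad in current GHC versions.
instance Functor Identity where
  fmap f (Id x) = Id (f x)

instance Applicative Identity where
  pure          = Id
  Id f <*> Id x = Id (f x)

instance Monad Identity where
  Id x >>= f = f x

-- Value recursion operator for the tagged identity monad.
mfixId :: (a -> Identity a) -> Identity a
mfixId f = fix (f . unId)
```

For example, unId (mfixId (\xs -> Id (1 : xs))) is the infinite list of 1's.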

Properties It is easy to see that all of our properties hold for Equation 4.1, including
nesting, strong sliding and right shrinking. Furthermore, the identity monad is both
idempotent and commutative, and it is an easy exercise to show that Properties 3.2.1
and 3.3.1 both hold.

The identity monad embeds into any other monad n, as long as return_n is monic.
The homomorphism e = return_n easily satisfies Equations 3.11–3.13, assuming n has a
value recursion operator. (In other words, the identity monad is initial in the category of
monads and monad homomorphisms.)

4.2 Exceptions: The maybe monad 

The maybe monad of Haskell can be used to model exceptions: 

    data Maybe a = Nothing | Just a

    return = Just

    Nothing >>= f = Nothing
    Just x  >>= f = f x

with the following unique value recursion operator:

    mfix :: (a → Maybe a) → Maybe a
    mfix f = fix (f ∘ unJust)                            (4.2)
        where unJust (Just x) = x

Proposition 4.2.1 Equation 4.2 defines the unique value recursion operator for the
maybe monad.

1 The newtype declaration avoids adding a separate ⊥ element. If a data declaration is used, >>= should
match lazily (i.e., ~(Id x) >>= f = f x) to avoid strictness problems. (See Section 3.1 for details.)

Proof Strictness and purity are straightforward. For left shrinking, we need to show:

    mfix (λx. a >>= λy. f x y) = a >>= λy. mfix (λx. f x y)

where a is a free variable. Case analysis on a suffices to show the equivalence. When
a = ⊥, both sides yield ⊥. When a = Nothing, we get Nothing. Finally, when a = Just z
for some z, both sides yield mfix (λx. f x z).

To show uniqueness, we do a similar case analysis. If f ⊥ = ⊥, mfix f must be
⊥ by strictness. If f ⊥ = Nothing, monotonicity implies that f = const Nothing, and
Proposition 2.6.1 guarantees that mfix f = Nothing. Finally, if f ⊥ = Just z for some
z, then f must factor through Just by monotonicity, i.e., there must be a function h such
that f = Just ∘ h, or equivalently, h = unJust ∘ f. Therefore,

    mfix f = mfix (Just ∘ h)
           = mfix (return ∘ h)
           = return (fix h)                      {purity}
           = return (fix (unJust ∘ f))

To summarize, we have:

    mfix f = case f ⊥ of
               Nothing → Nothing                         (4.3)
               Just _  → return (fix (unJust ∘ f))

Note that we did not make any choices in constructing Equation 4.3; the behavior of
mfix is completely dictated by the properties that must be satisfied by all value recursion
operators. We leave it to the reader to show that Equations 4.2 and 4.3 are equivalent,
establishing uniqueness. □

Remark 4.2.2 By Proposition 2.6.3, f ⊥ is always an approximation to mfix f,
justifying the case expression in Equation 4.3. Note that the case when f ⊥ = ⊥ is implicitly
handled by pattern match failure.

Properties It is easy to show that Equation 4.2 also satisfies sliding and nesting
properties. As stated in Corollary 3.1.7, strong sliding and right shrinking properties fail.

How about idempotency (Proposition 3.2.1) and commutativity (Proposition 3.3.1)?
It turns out that the exception monad is indeed idempotent (i.e., satisfies Equation 3.2).
Equations 3.4 and 3.5 are both satisfied. On the other hand, exceptions are not
commutative, due to the possibility of non-termination: Nothing >>= λx. ⊥ = Nothing, whereas
⊥ >>= λx. Nothing = ⊥. Consequently the commutativity property is not applicable.

Finally, we consider distributivity (Property 3.4.1). As mentioned in Section 3.4,
the maybe monad is additive:

    zero = Nothing

    Nothing ⊕ y = y
    Just x  ⊕ y = Just x

To establish

    mfix (λx. f x ⊕ g x) = mfix f ⊕ mfix g

it suffices to do a case analysis on f ⊥. In case f ⊥ = ⊥, both sides will yield ⊥. In case
f ⊥ = Nothing, we will get mfix g on both sides. Finally, if f ⊥ takes the form of a Just,
both sides will reduce to mfix f. We leave the details to the reader.

Remark 4.2.3 It is instructive to study failing definitions of mfix as well. Consider:

    mfix' f = let Just x = f x
              in return x

which is somewhat intuitive, considering how the recursive knot is tied over x. Obviously,
strictness fails. More seriously, left shrinking fails as well:

    mfix' (λx. Nothing >>= λy. return 1) = Just ⊥
    Nothing >>= λy. mfix' (λx. return 1) = Nothing

compromising the equivalence of do and mdo expressions in the absence of recursion. We
also have mfix' (λx. Nothing) = Just ⊥, which is truly bizarre.
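The contrast between Equation 4.2 and the failing candidate can be observed in GHC. The sketch below is illustrative only: the names mfixMaybe, mfixBad, shrinkLhs and shrinkRhs are ours, and unJust is given an explicit error case where the text relies on pattern match failure.

```haskell
import Data.Function (fix)
import Data.Maybe (isJust)

unJust :: Maybe a -> a
unJust (Just x) = x
unJust Nothing  = error "unJust: Nothing"

-- Equation 4.2.
mfixMaybe :: (a -> Maybe a) -> Maybe a
mfixMaybe f = fix (f . unJust)

-- The failing candidate of Remark 4.2.3. A let-bound pattern is matched
-- lazily, so the result is a Just even when f never returns one.
mfixBad :: (a -> Maybe a) -> Maybe a
mfixBad f = let Just x = f x
            in return x

-- Both sides of left shrinking, instantiated as in the remark and
-- parameterized over the operator under test.
shrinkLhs, shrinkRhs :: ((Int -> Maybe Int) -> Maybe Int) -> Maybe Int
shrinkLhs op = op (\_ -> Nothing >>= \_ -> return 1)
shrinkRhs op = Nothing >>= \_ -> op (\_ -> return 1)
```

Here isJust (shrinkLhs mfixBad) is True while shrinkRhs mfixBad is Nothing, whereas both sides are Nothing for mfixMaybe. The value inside the bogus Just is ⊥, so only isJust can safely inspect it.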

4.3 Lists 

The list monad of Haskell can be used to model computations with multiple results: 

    return x = [x]

    []     >>= f = []
    (x:xs) >>= f = f x ++ (xs >>= f)

Given a function f :: a → [a], how do we compute mfix f :: [a]? Intuitively, we need
to select a pivot value to tie the recursive knot. Consider the following two candidates:

    let (a : _) = f a                let (_ : a : _) = f a
    in f a                           in f a

where we pivot over the first and the second element of the result, respectively. Of course,
there is an infinite family of such functions, one for each particular position. As we will
see later in this section, none of these alternatives give rise to a value recursion operator.
Instead, we consider a moving pivot: Rather than fixing a single pivot element for the
entire computation, we compute each element in the result using its own position as the
pivot element. That is, the ith element of the fixed-point of f can be selected as the
fixed-point of the function head ∘ tail^i ∘ f, suggesting:

    mfix f = fix (head ∘ f) : mfix (tail ∘ f)

There is a slight problem with this approach, however: It always generates an infinite list,
repeating ⊥ after reaching the actual end of the list. Luckily, there is an easy solution.
Rather than computing fix (head ∘ f), we can compute fix (f ∘ head), and stop when
the result is []. Putting these ideas all together, we obtain the following operator:

    mfix :: (a → [a]) → [a]
    mfix f = case fix (f ∘ head) of
               []    → []                                (4.4)
               (x:_) → x : mfix (tail ∘ f)
As the following proposition shows, this definition of mfix is extremely well-behaved:

Proposition 4.3.1 The function mfix given by Equation 4.4 satisfies:

    mfix f = ⊥              ⟺  f ⊥ = ⊥                    (4.5)
    mfix f = []             ⟺  f ⊥ = []                   (4.6)
    head (mfix f)           =  fix (head ∘ f)              (4.7)
    tail (mfix f)           =  mfix (tail ∘ f)             (4.8)
    mfix (λx. f x : g x)    =  fix f : mfix g              (4.9)
    mfix (λx. f x ++ g x)   =  mfix f ++ mfix g            (4.10)

Proof See Appendix B.6. □



Remark 4.3.2 From the first two equivalences in Proposition 4.3.1, we see that mfix f
structurally follows f ⊥. That is, if mfix f is ⊥ or [], then so is f ⊥, and vice-versa.
Similarly, mfix f is a cons-cell exactly when f ⊥ is. We see this correspondence over and
over in monads that are based on sum-like data structures. (See also Remark 2.6.4.)

Proposition 4.3.3 Equation 4.4 defines the unique value recursion operator for the list
monad.

Proof Strictness is exactly the first equivalence in Proposition 4.3.1. Purity is easy to
establish; we leave it to the reader. Left shrinking is more interesting. We show:

    mfix (λx. a >>= λy. f x y) = a >>= λy. mfix (λx. f x y)

by structural induction on a. The base cases, a = ⊥ and a = [], are immediate. For the
inductive step, we assume a = q : qs, and reason as follows:

    mfix (λx. (q : qs) >>= λy. f x y)
      = mfix (λx. f x q ++ (qs >>= λy. f x y))
      = mfix (λx. f x q) ++ mfix (λx. qs >>= λy. f x y)              {Eqn. 4.10}
      = mfix (λx. f x q) ++ (qs >>= λy. mfix (λx. f x y))            {I.H.}
      = (λy. mfix (λx. f x y)) q ++ (qs >>= λy. mfix (λx. f x y))
      = (q : qs) >>= λy. mfix (λx. f x y)

establishing the left shrinking property, and completing the proof that we have a legitimate
value recursion operator.

For uniqueness, we will appeal to the approximation lemma.² Let mfix refer to the
function defined by Equation 4.4, and let mfix' be another value recursion operator for the
list monad. We will show that:

    ∀n. ∀f. approx n (mfix f) = approx n (mfix' f)

establishing uniqueness. The proof is by induction on n. The base case (n = 0) is
immediate. The induction hypothesis is:

    ∀f. approx k (mfix f) = approx k (mfix' f)

for a fixed natural number k. We need to show that:

    ∀f. approx (k+1) (mfix f) = approx (k+1) (mfix' f)

Pick an arbitrary function f. The proof proceeds by case analysis on the value of f ⊥.
If f ⊥ = ⊥, then both sides yield ⊥ by the strictness property. If f ⊥ = [], then
f = const [] by monotonicity, and both sides yield [] by Proposition 2.6.1. The case
when f ⊥ is a cons-cell is a bit more involved. By monotonicity, we have

    f x = (head ∘ f) x : (tail ∘ f) x = [(head ∘ f) x] ++ (tail ∘ f) x      (4.11)

for all x, since f will always produce a cons-cell given any argument. Furthermore, the
list monad satisfies Equation 3.8, and hence mfix' must satisfy Equation 3.10 by
Proposition 3.4.2, where ⊕ = ++. Now, it is easy to see that:



    mfix' f = mfix' (λx. [(head ∘ f) x] ++ (tail ∘ f) x)        {Eqn. 4.11}
            = mfix' (return ∘ head ∘ f) ++ mfix' (tail ∘ f)     {Eqn. 3.10}
            = return (fix (head ∘ f)) ++ mfix' (tail ∘ f)       {purity}
            = [fix (head ∘ f)] ++ mfix' (tail ∘ f)
            = fix (head ∘ f) : mfix' (tail ∘ f)

Also note that Equation 4.4 will take its second branch when f ⊥ is a cons-cell. Therefore,
the proof obligation reduces to:

    head (fix (f ∘ head)) : approx k (mfix (tail ∘ f))
      = fix (head ∘ f) : approx k (mfix' (tail ∘ f))

by the definition of approx, and the above derivation. But this equation is immediate:
First elements are equivalent by the dinaturality of fix, and the tails are equivalent by the
induction hypothesis. □

2 See Appendix B.6 for a more detailed example use of this lemma.

Properties It is not very hard to show that the sliding and nesting properties hold. By
the last equation in Proposition 4.3.1, distributivity holds as well (Property 3.4.1). On
the negative side, both strong sliding and right shrinking properties fail, as pointed out in
Corollary 3.1.7.

Remark 4.3.4 The maybe monad embeds into the list monad, as described in
Section 3.5. Furthermore, the value recursion operator for the maybe monad is exactly the
one predicted by Equation 3.14.

Remark 4.3.5 We close this section by discussing failing definitions of mfix for the list
monad. Consider the function:

    f xs = [take 3 (1 : xs), take 3 (2 : xs)]            (4.12)

What should mfix f be? Our definition yields: [[1, 1, 1], [2, 2, 2]], but the reader might
wonder about [[1, 1, 1], [2, 1, 1]], or [[1, 2, 2], [2, 2, 2]], which are produced by the
two alternatives we have seen at the beginning of this section, i.e., by pivoting over the
first and second elements of the result. As we have mentioned, there is an infinite family
of such operators:³

    mfix_i f = fix (f ∘ head ∘ tail^i),    i ≥ 0           (4.13)



3 Note that these alternatives do not form a chain; they are all incomparable. Furthermore, they are all
incomparable to our definition of mfix (i.e., Equation 4.4) as well.

How about properties? It is easy to see that strictness holds for all mfix_i, but that's
where the good news ends. Except for mfix_0, all members violate purity. We have:

    mfix_i (return ∘ f) = return (f ⊥),    i > 0

Furthermore, the left shrinking property fails for all of them. For instance,

    mfix_0 (λx. [1, 2] >>= λy. [y, x]) = [1, 1, 2, 1]
    [1, 2] >>= λy. mfix_0 (λx. [y, x]) = [1, 1, 2, 2]

compromising the equivalence of do and mdo expressions in the absence of recursion.
Intuitively, these definitions cause interference between elements. Note that:

    λx. [1, 2] >>= λy. [y, x] = λx. [1, x, 2, x]

and there is no reason to expect anything but ⊥ to play the role of x in the fixed-point,
as there is no information on what it can be. Indeed, our definition of mfix yields:

    mfix (λx. [1, 2] >>= λy. [y, x]) = [1, ⊥, 2, ⊥]
    [1, 2] >>= λy. mfix (λx. [y, x]) = [1, ⊥, 2, ⊥]

In the light of this discussion, we see that neither the list [[1, 1, 1], [2, 1, 1]], nor the
list [[1, 2, 2], [2, 2, 2]] constitutes a viable fixed-point for the function defined by
Equation 4.12. Each indicates interference between the elements of the fixed-point, violating
the left shrinking property.
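The results quoted in this remark can be reproduced with a short Haskell sketch. The names mfixList, mfixPivot, f412 and g are ours; mfixPivot i stands for the fixed-pivot operator of Equation 4.13, written with (!! i) in place of head ∘ tail^i, which selects the same element.

```haskell
import Data.Function (fix)

-- Equation 4.4: the moving-pivot operator.
mfixList :: (a -> [a]) -> [a]
mfixList f = case fix (f . head) of
  []      -> []
  (x : _) -> x : mfixList (tail . f)

-- Equation 4.13: the fixed-pivot family, pivoting on position i.
mfixPivot :: Int -> (a -> [a]) -> [a]
mfixPivot i f = fix (f . (!! i))

-- Equation 4.12.
f412 :: [Int] -> [[Int]]
f412 xs = [take 3 (1 : xs), take 3 (2 : xs)]

-- The function used in the left shrinking counterexample.
g :: Int -> [Int]
g x = [1, 2] >>= \y -> [y, x]
```

With these definitions, mfixList f412 is [[1,1,1],[2,2,2]], mfixPivot 0 f412 is [[1,1,1],[2,1,1]], and mfixPivot 1 f412 is [[1,2,2],[2,2,2]]. Likewise mfixPivot 0 g is [1,1,2,1], while [1, 2] >>= \y -> mfixPivot 0 (\x -> [y, x]) is [1,1,2,2]. The list mfixList g has four elements, of which the second and fourth are ⊥ and must not be forced.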

In Section [97X1 we will see an example use of value recursion on the list monad, providing 
practical evidence for the definition given by Equation 4.4 being preferable over those given
by Equation 4.13.

4.4 State 

State monads capture the notion of computations that depend on modifiable stores,
providing safe access to imperative features [51, 52]. A typical state monad, manipulating an
internal state with type τ, has the following structure [7, 91]:

    type ST τ a = τ → (a, τ)

    return x = λs. (x, s)
    f >>= g  = λs. let (a, s') = f s
                   in g a s'

The corresponding value recursion operator is given by:

    mfix_ω :: (a → ST τ a) → ST τ a
    mfix_ω f = λs. let (a, s') = f a s                    (4.14)
                   in (a, s')

(The reason for the name will be clear in a moment.)



Remark 4.4.1 The following picture depicts the operation of the value recursion
operator for the state monad, providing the intuition for the diagrams we have been using so
far (see also Remark 2.2.2):

    [Figure: diagram of mfix_ω f, with lines labeled "state in", "value out", and "state out".]

The monads we have considered up to now (i.e., identity, exceptions, and lists) enjoy
the property that they all have unique value recursion operators. Is this the case for
the state monad as well? Referring to the picture above, we see that the resulting state
transformer is required to return the fixed-point value in the value out line in order to
satisfy purity, but it is not clear how we should determine the final state, i.e., the value of
the state out line. Equation 4.14 captures the case when state out is obtained by running
f on the fixed-point value and the current state. It is possible to consider an alternative
semantics, where the resulting state is determined without any regard to the value part,
i.e., without any use of the fixed-point value. That is, a definition of the form:

    mfix f = λs. let (a, _) = f a s in (a, π₂ (f ⊥ s))    (4.15)

with the following picture:

    [Figure: diagram of the operator of Equation 4.15, with lines labeled "value out", "state in", and "state out".]



We might think of this operator as being strictly sequential in the state, i.e., it does
not make use of any "future" knowledge in determining what the final state should be.
There is a whole family of such operators, using approximations to the fixed-point value:

    mfix_i f = λs. let (a, _) = f a s in (a, pick_i f s),    i ≥ 0     (4.16)

where

    pick_i f s = π₂ (f ((λa. π₁ (f a s))^i ⊥) s)          (4.17)

For instance, the picture for pick_2 is:

    [Figure: diagram for pick_2, built from three copies of f.]

Note that mfix_0 is precisely the operator defined by Equation 4.15. By construction,
each pick_i is an approximation to the next, i.e., pick_i ⊑ pick_{i+1}, implying mfix_i ⊑ mfix_{i+1}.
Furthermore, it is easy to see that:

    mfix_ω = ⊔_{i=0}^{∞} mfix_i                            (4.18)

where the mfix_ω on the left hand side is the operator defined by Equation 4.14.



Example 4.4.2 The functions mfix_i, for all i, and mfix_ω will always agree on the value
part of the fixed-point. It is the final state that will be approximated by each mfix_i, the
limit being delivered by mfix_ω. To demonstrate, consider the following function:

    f :: [Int] → ST [Int] [Int]
    f xs s = (1 : xs, xs)

We have:

    π₂ (mfix_i f []) = 1 : 1 : ... : 1 : ⊥      (with 1 repeated i times)

As expected, π₂ (mfix_ω f []) yields the infinite list of 1's. Notice how approximations are
reflected in the final state. (In all cases π₁ (mfix_i f []), i.e., the value part, will always be
the infinite list of 1's.)
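The family of operators and this example can be transcribed into Haskell almost literally. In the sketch below the names mfixOmega, mfixI, pick and ex are ours, ⊥ is modelled by undefined, and Haskell's lifted pairs stand in for the true products assumed in the text, so it illustrates the behavior on this example rather than the strictness property.

```haskell
type ST s a = s -> (a, s)

-- Equation 4.14.
mfixOmega :: (a -> ST s a) -> ST s a
mfixOmega f = \s -> let (a, s') = f a s in (a, s')

-- Equation 4.17: the final state obtained from the i-th approximation
-- of the value part, starting from bottom.
pick :: Int -> (a -> ST s a) -> s -> s
pick i f s = snd (f (iterate (\a -> fst (f a s)) undefined !! i) s)

-- Equation 4.16.
mfixI :: Int -> (a -> ST s a) -> ST s a
mfixI i f = \s -> let (a, _) = f a s in (a, pick i f s)

-- The function of Example 4.4.2.
ex :: [Int] -> ST [Int] [Int]
ex xs _ = (1 : xs, xs)
```

For instance, take 3 (snd (mfixI 3 ex [])) is [1,1,1], but asking for a fourth element of that final state hits ⊥. With mfixOmega, any finite prefix of the final state consists of 1's.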



Proposition 4.4.3 The functions mfix_i, for all i (Equation 4.16), and mfix_ω
(Equation 4.14) are value recursion operators for the state monad.

Proof For brevity, we will only consider mfix_ω here. Proofs for mfix_i are a bit more
tedious, but equally easy. For strictness, we note that a function f of type a → ST τ a is
strict exactly when f ⊥_a s = (⊥_a, ⊥_τ) for all s. We have:

    mfix_ω f s = let (a, s') = f a s in (a, s')
               = let a = fix (λa. π₁ (f a s)) in (a, π₂ (f a s))

Because the function λa. π₁ (f a s) is strict, its fixed-point is ⊥. Therefore, mfix_ω f s =
(⊥, ⊥), establishing that mfix_ω f is ⊥.⁴

For purity, we have:

    mfix_ω (return ∘ f) = λs. let (a, s') = (return ∘ f) a s in (a, s')
                        = λs. let (a, s') = (f a, s) in (a, s')
                        = λs. let a = fix (λa. f a) in (a, s)
                        = λs. (fix f, s)
                        = return (fix f)

For left shrinking, we need to show that:

    mfix_ω (λx. g >>= λy. f x y) = g >>= λy. mfix_ω (λx. f x y)

Simple symbolic manipulation reduces both sides to:

    λs. let (a, s')   = g s
            (a', s'') = f a' a s'
        in (a', s'')

completing the proof. □

Remark 4.4.4 Abusing the terminology a bit, one might consider mfix_ω as a
lazy-in-the-state value recursion operator, while mfix_0 is strict. As we will see in Section 4.8
and in Chapter 8 in detail, the operation of mfix_0 is quite similar to the operation of
value recursion operators for stream processing and IO monads. It is hard to develop
a corresponding intuition for mfix_i when i ≠ 0. We do not know any applications that
might benefit from them. Furthermore, they behave strangely with respect to the nesting
property, as we will see shortly.

Properties Having established that mfix_i, for all i, and mfix_ω are value recursion
operators, we now take a look at other properties. It turns out that sliding (Property 2.4.1)
is satisfied by all of them, but nesting (Property 2.5.1) only holds for mfix_0 and mfix_ω.
Strong sliding and right shrinking properties only hold for mfix_ω.

Counterexample 4.4.5 Let us first consider nesting. Let 



    f :: ([Int], [Int]) → ST [Int] [Int]
    f x s = (1 : π₁ x, π₂ x)

Considering left and right hand sides of Equation 2.7, for each i > 0, we have:

    π₂ (mfix_i (λx. mfix_i (λy. f (x, y))) []) = 1^(i+1) : ⊥
    π₂ (mfix_i (λx. f (x, x)) [])              = 1^i : ⊥

where 1^k denotes a list of k 1's. Since the final states differ, nesting fails. (The value part
will be the infinite list of 1's in both cases.) For the single call to mfix_i in the second line,
we simply get a snapshot of the value after i iterations, that is, exactly i 1's. The nested
calls to mfix_i, and hence to pick_i, result in the extra 1 in the first line. This behavior is
truly bizarre from the viewpoint of value recursion. In case of mfix_0, the final states will
both be ⊥, since the inner call to pick will be ignored by the outer one. In case of mfix_ω,
the final state will be the infinite list of 1's, as expected.

For strong sliding (Section 2.7.1), consider:

    f :: [Int] → ST [Int] [Int]          h :: [Int] → [Int]
    f xs s = (xs, xs)                    h xs = 1 : xs

Note that f (h ⊥) = λs. (1 : ⊥, 1 : ⊥) ≠ λs. (⊥, ⊥) = f ⊥, hence sliding (Property 2.4.1)
does not apply. Considering Equation 2.5, we have:

    π₂ (mfix_0 (map h ∘ f) [])     = ⊥
    π₂ (map h (mfix_0 (f ∘ h)) []) = 1 : ⊥

showing that strong sliding fails. For right shrinking (Property 2.7.3), let

    f :: [Int] → ST [Int] [Int]          g :: [Int] → ST [Int] [Int]
    f xs s = (1 : xs, xs)                g xs (_ : k : _) = (xs, [k])

We leave it to the reader to show that right shrinking fails for mfix_0 with this instantiation.

It is possible to generalize these examples for all other mfix_i, whenever i is finite.

4 We caution the reader about the use of true products. In case of lifted products, we would get
mfix f = λs. (⊥, ⊥) ≠ λs. ⊥, violating strictness. But this is hardly surprising: even monad laws fail
in this case. It is easy to see that (λs. ⊥) >>= return = λs. (⊥, ⊥), failing the right unit law.
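The nesting counterexample can be checked on finite prefixes of the final states. The sketch below repeats the operators of Equations 4.14, 4.16 and 4.17 so that it stands alone; the names mfixOmega, mfixI, pick, fN, Op, nestLhs and nestRhs are ours, and ⊥ is again modelled by undefined.

```haskell
type ST s a = s -> (a, s)

mfixOmega :: (a -> ST s a) -> ST s a
mfixOmega f = \s -> let (a, s') = f a s in (a, s')

pick :: Int -> (a -> ST s a) -> s -> s
pick i f s = snd (f (iterate (\a -> fst (f a s)) undefined !! i) s)

mfixI :: Int -> (a -> ST s a) -> ST s a
mfixI i f = \s -> let (a, _) = f a s in (a, pick i f s)

-- The function of the nesting counterexample.
fN :: ([Int], [Int]) -> ST [Int] [Int]
fN x _ = (1 : fst x, snd x)

-- An operator under test, specialized to the types used here.
type Op = ([Int] -> ST [Int] [Int]) -> ST [Int] [Int]

-- Final states of the two sides of the nesting property,
-- both started from the empty list.
nestLhs, nestRhs :: Op -> [Int]
nestLhs op = snd (op (\x -> op (\y -> fN (x, y))) [])
nestRhs op = snd (op (\x -> fN (x, x)) [])
```

For mfixI 2, take 3 (nestLhs (mfixI 2)) is [1,1,1], while nestRhs (mfixI 2) only has two defined elements before reaching ⊥. For mfixOmega, both sides agree on every finite prefix.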

Remark 4.4.6 We do not know whether there are other value recursion operators for 
the state monad. 

4.5 Output monad and monads based on monoids 

Every monoid gives rise to a monad, referred to as its representation monad [2]. In
programming, the best known example is the output monad, as we will see shortly. Let
(M, ⊕, unit) be a monoid, where M is the underlying type. The corresponding
representation monad is given by:

    type Rep_M a = (a, M)

    return x = (x, unit)
    ma >>= f = let (a, m) = ma
                   (b, n) = f a
               in (b, m ⊕ n)

For instance, substituting String for M, "" for unit, and ++ for ⊕, one obtains the usual
output monad [7, 91]. The obvious value recursion operator is given by:

    mfix_ω :: (a → Rep_M a) → Rep_M a
    mfix_ω f = let (a, m) = f a in (a, m)                 (4.19)

As with the state monad, the choice of the name mfix_ω is not arbitrary. We have a
family of recursion operators:

    mfix_i f = let (a, _) = f a in (a, pick_i f),    i ≥ 0   (4.20)

where

    pick_i f = π₂ (f ((π₁ ∘ f)^i ⊥))                      (4.21)

A straightforward calculation (analogous to Equation 4.18) shows that:

    mfix_ω = ⊔_{i=0}^{∞} mfix_i                            (4.22)

The correspondence with the state monad is not accidental. Any such representation
monad embeds into the state monad via the embedding:

    e (a, m) = λn. (a, n ⊕ m)                            (4.23)

with the left inverse: e^l f = f unit. Furthermore, e works uniformly over all value
recursion operators, including mfix_ω. That is, for any monoid M:

    e (mfix_i^{Rep_M} f) = mfix_i^{ST} (e ∘ f)              (4.24)

where i is either a natural number or ω. It is an easy exercise to show that the embedding
requirements (i.e., Equations 3.11–3.13) hold for e.

Properties By Proposition 3.5.3, whenever an mfix for the state monad satisfies
purity or left shrinking, the corresponding operator for the representation monad of a given
monoid will satisfy it too. Note that e is not strict, hence strictness is not
automatically guaranteed (see Remark 3.5.4). However, it is easy to see that all mfix_i and mfix_ω
satisfy strictness. Therefore, we have an infinite family of value recursion operators for
representation monads, similar to the case for the state monad.

By Proposition 3.5.5, sliding, nesting, strong sliding, and right shrinking properties
hold whenever the corresponding operator for the state monad satisfies them. On the
negative side, all of the counterexamples we gave for the state monad can be converted
to counterexamples for representation monads with no difficulty, invalidating nesting for
mfix_i when i > 0, and strong sliding and right shrinking for all but mfix_ω.

If the underlying monoid is idempotent, the representation monad will be
idempotent as well. Similarly, commutativity of the monoid implies the commutativity of the
monad. In both cases, mfix_ω will preserve idempotency and commutativity
(Properties 3.2.1 and 3.3.1). Unfortunately, this result does not extend to mfix_i automatically.⁵

Remark 4.5.1 Similar to the case for the state monad, it is an open question whether 
there are other value recursion operators for monads based on monoids. 
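The representation monad, its operator of Equation 4.19, and the embedding of Equation 4.23 can be sketched in Haskell as follows. This is an illustration under assumptions of ours: the monoid is supplied through the standard Monoid class (with ⊕ written as <> and unit as mempty), the names mfixRep, mfixST, embedRep, unembedRep and tick are hypothetical, and the embedding uses a lazy pattern so that it does not force its argument.

```haskell
type Rep m a = (a, m)
type ST s a = s -> (a, s)

-- Equation 4.19.
mfixRep :: (a -> Rep m a) -> Rep m a
mfixRep f = let (a, m) = f a in (a, m)

-- Equation 4.14, the corresponding operator for the state monad.
mfixST :: (a -> ST s a) -> ST s a
mfixST f = \s -> let (a, s') = f a s in (a, s')

-- Equation 4.23: append the output to the incoming state.
embedRep :: Monoid m => Rep m a -> ST m a
embedRep ~(a, m) = \n -> (a, n <> m)

-- Left inverse: run the state transformer from the unit of the monoid.
unembedRep :: Monoid m => ST m a -> Rep m a
unembedRep f = f mempty

-- A sample computation in the output monad (M = String).
tick :: [Int] -> Rep String [Int]
tick xs = (1 : xs, "tick")
```

Running both sides of Equation 4.24 (for i = ω) on the initial state "log:" gives the final state "log:tick" in each case, with the infinite list of 1's as the value part.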

4.6 Environments 

The environment monad, also known as the reader monad, captures computations that
use a store to read values without modifying them. Using an environment of type ρ, the
environment monad has the following structure:

    type Env ρ a = ρ → a

    return x = λe. x
    f >>= g  = λe. g (f e) e

The corresponding value recursion operator is given by:

    mfix :: (a → Env ρ a) → Env ρ a
    mfix f = λe. let a = f a e                            (4.25)
                 in a



5 For instance, Equation 3.4 will hold for mfix_0 only when π₂ (f ⊥) ⊕ π₂ (f (π₁ (f ⊥))) = π₂ (f ⊥),
which is not guaranteed just by the fact that ⊕ is idempotent. Similar arguments apply to Equations 3.5
and 3.7 as well.

Remark 4.6.1 It is an easy exercise to show that Equation 4.25 is equivalent to the
generic mfix given in Section 1.4. To the best of our knowledge, identity and
environment monads are the only examples where the generic version acts as the value recursion
operator.

Unsurprisingly, the environment monad embeds into the state monad. The embedding⁶
is given by e f = λs. (f s, s), with the left inverse e^l f = π₁ ∘ f. It is easy to see
that strictness holds for Equation 4.25. Therefore, Proposition 3.5.3 guarantees that
Equation 4.25 defines a value recursion operator for the environment monad.



Properties By Proposition 3.5.5 and the observations made above, Equation 4.25 satisfies
all the properties satisfied by mfix_ω of the state monad. That is, sliding, nesting,
strong sliding, and right shrinking properties, along with the basic requirements of
strictness, purity and left shrinking are all satisfied.

Finally, the environment monad is both idempotent and commutative, and Properties
3.2.1 and 3.3.1 are both satisfied.

Remark 4.6.2 We do not know whether Equation 4.25 defines the unique value recursion
operator for the environment monad.
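
Equation 4.25 can be tried out directly in Haskell. The following is a minimal sketch
rather than code from the thesis: the newtype wrapper Env, the accessor runEnv, and the
example repeatEnv are names assumed here purely for illustration.

```haskell
-- The environment monad, wrapped in a newtype so that it can be given instances.
newtype Env r a = Env { runEnv :: r -> a }

instance Functor (Env r) where
  fmap h (Env f) = Env (h . f)

instance Applicative (Env r) where
  pure x          = Env (\_ -> x)
  Env f <*> Env g = Env (\e -> f e (g e))

instance Monad (Env r) where
  Env f >>= g = Env (\e -> runEnv (g (f e)) e)

-- Equation 4.25: the knot is tied over the value only; the same
-- environment e is handed unchanged to the recursive computation.
mfixEnv :: (a -> Env r a) -> Env r a
mfixEnv f = Env (\e -> let a = runEnv (f a) e in a)

-- Hypothetical example: an infinite list of copies of the environment.
repeatEnv :: Env Int [Int]
repeatEnv = mfixEnv (\xs -> Env (\e -> e : xs))
```

Because lists are lazy, an expression such as take 3 (runEnv repeatEnv 7) forces only a
finite prefix of the cyclic result.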

4.7 Tree monad 

In this section, we will briefly cover the tree monad [42]: 

    data Tree α = Leaf α | Fork (Tree α) (Tree α)

    return x       = Leaf x
    Leaf x   >>= f = f x
    Fork l r >>= f = Fork (l >>= f) (r >>= f)

The effect of >>= is to splice new subtrees on every Leaf of the first argument. The
corresponding value recursion operator is given by:

    mfix :: (α → Tree α) → Tree α
    mfix f = case fix (f · unL) of
               Leaf x   → Leaf x                                           (4.26)
               Fork _ _ → Fork (mfix (lc · f)) (mfix (rc · f))

The functions unL, lc, and rc are defined as follows:

    lc, rc :: Tree α → Tree α
    unL    :: Tree α → α

    unL (Leaf x)   = x
    lc  (Fork l r) = l
    rc  (Fork l r) = r

6 It does not matter which mfix is chosen for the state monad (i.e., mfix_ω of
Equation 4.14 or any mfix_i given by Equation 4.16).

Compared to the value recursion operator for the list monad (Equation 4.4), we see
that unL plays the role of head, while tail is replaced by lc and rc, projecting out the
children at each node. Otherwise, the definitions are structurally the same.
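
Equation 4.26 transcribes almost literally into Haskell. The sketch below is offered
under the assumption that fix from Data.Function plays the role of the fix used in the
text; the names mfixTree and example are hypothetical.

```haskell
import Data.Function (fix)

data Tree a = Leaf a | Fork (Tree a) (Tree a)

-- Partial projections, as in the text: each is undefined on the other constructor.
unL :: Tree a -> a
unL (Leaf x) = x

lc, rc :: Tree a -> Tree a
lc (Fork l _) = l
rc (Fork _ r) = r

-- Equation 4.26: inspect the shape of the fixed point, then recurse into the children.
mfixTree :: (a -> Tree a) -> Tree a
mfixTree f = case fix (f . unL) of
  Leaf x   -> Leaf x
  Fork _ _ -> Fork (mfixTree (lc . f)) (mfixTree (rc . f))

-- Hypothetical example: each leaf receives its own cyclic list.
example :: Tree [Int]
example = mfixTree (\xs -> Fork (Leaf (1 : xs)) (Leaf (2 : xs)))
```

In this example the left leaf holds an infinite list of 1's and the right leaf an
infinite list of 2's: every leaf ties its own knot, just as every element does in the
list monad.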

Remark 4.7.1 Despite all the similarities, the list monad does not embed into the tree 
monad. There is no suitable element to map [] to, since our trees are always non-empty. 
(An alternative formulation of trees, where data is stored in the nodes and leaves are 
empty, does not give rise to a monad structure.) 



Proposition 4.7.2 The function mfix given by Equation 4.26 satisfies:

    mfix f = ⊥  ⟺  f ⊥ = ⊥                                                 (4.27)
    unL (mfix f) = fix (unL · f)                                           (4.28)
    lc (mfix f) = mfix (lc · f)                                            (4.29)
    rc (mfix f) = mfix (rc · f)                                            (4.30)
    mfix (λx. Fork (f x) (g x)) = Fork (mfix f) (mfix g)                   (4.31)

Proof Similar to the proof of Proposition 4.3.1. □

Proposition 4.7.3 Equation 4.26 defines the unique value recursion operator for the
tree monad.

Proof Analogous to the proof for the list monad (Proposition 4.3.3). Note that we
need to use a different version of approx that works on trees [38] (see Appendix B.6). For
uniqueness, we cannot refer to distributivity, as the tree monad is not additive. (There is
no appropriate unit element.) However, we still have the operator x ⊕ y = Fork x y,
which satisfies:

    (x ⊕ y) >>= f = (x >>= f) ⊕ (y >>= f)

hence a similar argument applies as in the case for the list monad. We leave the details
to the reader. □

Properties Sliding and nesting properties can be shown to hold for the tree monad,
while strong sliding and right shrinking fail by Propositions 3.1.5 and 3.1.6.

4.8 Fudgets 

In this section, we will take a look at fudgets,⁷ a monad that has been designed to model
stream based computations. In its simplest form, the fudgets monad looks like:

    data Fudget α = Val α
                  | Put Char (Fudget α)
                  | Get (Char → Fudget α)

    return        = Val
    Val a   >>= f = f a
    Put c m >>= f = Put c (m >>= f)
    Get h   >>= f = Get (λc. h c >>= f)

We will model functional I/O using a simple interpreter over this data type:⁸

    run :: Fudget α → String → (String, α, String)
    run (Val a)   inp    = ("", a, inp)
    run (Put c m) inp    = let (o, a, r) = run m inp in ('!' : c : o, a, r)
    run (Get f)   []     = error "trying to Get from an empty stream!"
    run (Get f)   (c:cs) = let (o, a, r) = run (f c) cs in ('?' : c : o, a, r)

The function run accepts a fudget and an input stream, runs the computation and 
delivers the list of I/O operations that took place, together with the final value and the 
remainder of the input. The list of operations consists of all characters that are printed 
via Put (prefixed by !), and all characters that are read from the input via Get (prefixed
by ?). Note that the order is important, as it indicates the temporal relationship between
I/O actions. For instance, we have:

    run (Put 'a' (Get (λc. Put c (Val c)))) "123" = ("!a?1!1", '1', "23")

For value recursion, we are interested in the meanings of fudgets of the form:

    mfix (λxs. Put 'a' (Val (1 : xs)))                                     (4.32)

which intuitively models a computation that will print the character a and then deliver
an infinite list of 1's. Or, more interestingly:

    mfix (λcs. Get (λc. Val (c : cs)))                                     (4.33)

which will first read a character from the input stream (if available), and then return an
infinite list containing copies of that character.

7 It would be more appropriate to call these "fudget-style stream processor monads," as the
presentation here is only loosely based on the original work on fudgets by Carlsson and
Hallgren [28, 29]. For brevity, however, we will continue using the word fudget.

8 We will investigate Haskell's internal IO monad in detail in Chapter 8.

One possible value recursion operator for the fudgets monad is given by:

    mfix f = case f ⊥ of
               Val _   → fix (f · unVal)
               Put c _ → Put c (mfix (unPut · f))                          (4.34)
               Get _   → Get (λc. mfix (unGet c · f))
      where
        unVal (Val a)   = a
        unPut (Put _ m) = m
        unGet c (Get h) = h c

With this definition, Expression 4.32 yields:

    run (mfix (λxs. Put 'a' (Val (1 : xs)))) "z" = ("!a", [1, 1, ...], "z")

where [1, 1, ...] denotes the infinite list of 1's. The result indicates that there was one
I/O action, which was printing the character 'a', and no input was consumed.
Expression 4.33 yields:

    run (mfix (λcs. Get (λc. Val (c : cs)))) "z" = ("?z", "zzz...", "")

indicating that the character 'z' is read from the input, the infinite list of z's is
returned, and all of the input was consumed. If the input stream were empty to start with,
we would end up with the error case, i.e., the result would be undefined.
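
The interpreter and Equation 4.34 can be exercised together in Haskell. This is a
sketch under stated assumptions: undefined stands in for ⊥, fix is taken from
Data.Function, and the names mfixF, ex432, ex433, and ex435 are hypothetical labels for
the operator and for Expressions 4.32, 4.33, and 4.35.

```haskell
import Data.Function (fix)

data Fudget a = Val a | Put Char (Fudget a) | Get (Char -> Fudget a)

-- The interpreter from the text: a log of I/O actions, the value, and leftover input.
run :: Fudget a -> String -> (String, a, String)
run (Val a)   inp    = ("", a, inp)
run (Put c m) inp    = let (o, a, r) = run m inp in ('!' : c : o, a, r)
run (Get _)   []     = error "trying to Get from an empty stream!"
run (Get f)   (c:cs) = let (o, a, r) = run (f c) cs in ('?' : c : o, a, r)

-- Equation 4.34: the outermost constructor is read off f applied to bottom;
-- the recursive knot is tied only once a Val is reached.
mfixF :: (a -> Fudget a) -> Fudget a
mfixF f = case f undefined of
    Val _   -> fix (f . unVal)
    Put c _ -> Put c (mfixF (unPut . f))
    Get _   -> Get (\c -> mfixF (unGet c . f))
  where
    unVal (Val a)   = a
    unPut (Put _ m) = m
    unGet c (Get h) = h c

ex432 :: Fudget [Int]
ex432 = mfixF (\xs -> Put 'a' (Val (1 : xs)))

ex433 :: Fudget String
ex433 = mfixF (\cs -> Get (\c -> Val (c : cs)))

ex435 :: Fudget Char
ex435 = mfixF (\c -> Put c (Val 'a'))
```

Running ex435 on the empty input produces a log whose first character is '!' and whose
second character is undefined, so only a prefix of that log can be inspected safely.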

So far, the behavior of mfix seems to be consistent with the way we perceive I/O. Here 
is a slightly more challenging expression: 

    mfix (λc. Put c (Val 'a'))                                             (4.35)

What should the result be? Two possibilities arise. If we consider Put as an action causing
I/O, we see that it will not have its character ready for printing until after the computation
proceeds. That is, we should have:

    run (mfix (λc. Put c (Val 'a'))) "" = ("!⊥", 'a', "")

leaving the printed character undefined.⁹ Another option is to make the fixed-point value
available throughout the whole computation, yielding:¹⁰

    run (mfix (λc. Put c (Val 'a'))) "" = ("!a", 'a', "")

However, this alternative behavior is quite questionable. Consider the expression:

    mfix (λc. Put c (Get (λd. Val d)))

In this case, we have to look past Get to determine what Put should print. However, 
this character is simply not available until we run this fudget with a particular input 
stream. Such an operator would violate the temporal relationship between Put and Get. 
(Furthermore, to achieve this effect, one would need to combine the operation of run and 
mfix, making the input stream available when the fixed-point is computed.) 

Proposition 4.8.1 Equation 4.34 defines a value recursion operator for fudgets.

Proof Strictness and purity are immediate. Left shrinking can be established by
induction. (As discussed briefly above, uniqueness is not guaranteed as we can speculate
on the character to be printed whenever we have a Put constructor.) □

Properties Strong sliding and right shrinking both fail by Propositions 3.1.5 and 3.1.6.
Although we have not constructed the proofs, we believe that sliding and nesting properties
should hold.



9 The mfix we have given in Equation 4.34 produces this answer. As we will see in Chapter 8,
the function fixIO, the value recursion operator for Haskell's IO monad, behaves similarly.
(See Example 8.2.2.)

10 Ignoring the Get constructor, the fudgets monad is very similar to the output monad of
Section 4.5. The second alternative corresponds to the function mfix_ω (Eqn. 4.19), while the
first one corresponds to mfix_0 (Eqn. 4.20). It is possible to think of operators that
correspond to mfix_i when i ≠ 0 too.

4.9 Monad transformers

As pointed out in Section 3.6, monad transformers allow construction of new monads
from old ones. Although there is no magic recipe that will automatically lift a given mfix
through a transformer, it is possible to do so in many practical cases. In this section, we will
study three of the most common instances, namely error, environment, and state monad
transformers. (For a discussion of the continuation monad transformer, see Section 5.2.)

Liang defines the error monad transformer as follows:

    data Err α    = Ok α | Err String
    type ErrT m α = m (Err α)

    return a = return (Ok a)
    m >>= k  = m >>= λa. case a of
                           Ok x  → k x
                           Err s → return (Err s)
    lift m   = m >>= λa. return (Ok a)

Note that the return and >>= on the left hand side are the definitions for the new
monad ErrT m, while those on the right belong to the monad m. If m has a value recursion
operator mfixM, we can lift it up to ErrT m as follows:

    mfixErrM :: (α → ErrT m α) → ErrT m α
    mfixErrM f = mfixM (f · unErr)                                         (4.36)
      where unErr (Ok a) = a

The similarities between Equations 4.36 and 4.2 are not accidental. The function
unErr plays the same role as unJust, providing access to the value part of the computation.
While the value recursion operator for the maybe monad uses fix (i.e., the value
recursion operator for the identity monad) to tie the recursive knot, mfixErrM uses the
value recursion operator for the underlying monad to do so.

Proposition 4.9.1 Let mfixM be a value recursion operator for a given monad m. The
function mfixErrM, defined by Equation 4.36, is a value recursion operator for the monad
ErrT m.

Proof See Appendix B.7. □

Let us now consider the environment monad transformer, which adds an immutable
store to arbitrary monads. The definitions for the environment monad transformer are [53]:

    type EnvT ρ m α = ρ → m α

    return a = λe. return a
    m >>= k  = λe. m e >>= λa. k a e
    lift m   = λe. m

If the underlying monad has a value recursion operator mfixM, we can lift it to the
transformed monad as follows:

    mfixEnvM :: (α → EnvT ρ m α) → EnvT ρ m α
    mfixEnvM f = λe. mfixM (λa. f a e)                                     (4.37)

The definition of mfixEnvM exactly mimics the value recursion operator for the environment
monad (Equation 4.25), just like the case for the error monad transformer and
the maybe monad. Analogous to Proposition 4.9.1, we have:

Proposition 4.9.2 Let mfixM be a value recursion operator for a given monad m. The
function mfixEnvM, defined by Equation 4.37, is a value recursion operator for the monad
EnvT ρ m. □
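
The two liftings above (Equations 4.36 and 4.37) can be sketched in Haskell as follows.
This is an illustration under assumptions, not the thesis's implementation: the MonadFix
class from Control.Monad.Fix stands in for "a monad m with a value recursion operator
mfixM", the transformers are wrapped in newtypes, and the examples onesErr, failErr, and
repeatEnvT (all over the Maybe monad) are hypothetical.

```haskell
import Control.Monad.Fix (MonadFix (mfix))

data Err a = Ok a | Err String

newtype ErrT m a   = ErrT { runErrT :: m (Err a) }
newtype EnvT r m a = EnvT { runEnvT :: r -> m a }

-- Equation 4.36: unErr exposes the value part; the underlying mfix ties the knot.
mfixErrM :: MonadFix m => (a -> ErrT m a) -> ErrT m a
mfixErrM f = ErrT (mfix (runErrT . f . unErr))
  where unErr (Ok a) = a

-- Equation 4.37: fix the environment, then recurse in the underlying monad.
mfixEnvM :: MonadFix m => (a -> EnvT r m a) -> EnvT r m a
mfixEnvM f = EnvT (\e -> mfix (\a -> runEnvT (f a) e))

-- A successful recursive computation: the infinite list of 1's.
onesErr :: ErrT Maybe [Int]
onesErr = mfixErrM (\xs -> ErrT (Just (Ok (1 : xs))))

-- A failing computation: the error is returned without touching the recursive value.
failErr :: ErrT Maybe [Int]
failErr = mfixErrM (\_ -> ErrT (Just (Err "boom")))

-- An infinite list of copies of the environment.
repeatEnvT :: EnvT Int Maybe [Int]
repeatEnvT = mfixEnvM (\xs -> EnvT (\e -> Just (e : xs)))
```

Note that unErr is deliberately partial: when the computation ends in Err, the recursive
value is simply never demanded.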

Finally, we consider the state monad transformer [53]:

    type StateT σ m α = σ → m (α, σ)

    return a = λs. return (a, s)
    m >>= k  = λs. m s >>= λ(a, s'). k a s'
    lift m   = λs. m >>= λx. return (x, s)

Applying the pattern we have seen with the previous two examples, a given mfixM can
be lifted through the state monad transformer as follows:

    mfixStateM :: (α → StateT σ m α) → StateT σ m α
    mfixStateM f = λs. mfixM (λr. f (π₁ r) s)                              (4.38)

Proposition 4.9.3 Let mfixM be a value recursion operator for a given monad m. The
function mfixStateM, defined by Equation 4.38, is a value recursion operator for the monad
StateT σ m. □

Remark 4.9.4 The lifting given by Equation 4.38 behaves analogously to mfix_ω as given
by Equation 4.14. It does not seem possible to lift arbitrary value recursion operators so
that they will behave similarly to any of the mfix_i where i is finite (Equation 4.16).
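
Equation 4.38 can be sketched in the same style. As before, MonadFix from
Control.Monad.Fix is assumed to supply mfixM, the newtype StateT is defined locally
for illustration, and the example tick over the Maybe monad is hypothetical.

```haskell
import Control.Monad.Fix (MonadFix (mfix))

newtype StateT s m a = StateT { runStateT :: s -> m (a, s) }

-- Equation 4.38: only the value component of the result pair is fed back.
-- Using fst (rather than a pattern match on the pair) keeps the feedback lazy.
mfixStateM :: MonadFix m => (a -> StateT s m a) -> StateT s m a
mfixStateM f = StateT (\s -> mfix (\r -> runStateT (f (fst r)) s))

-- Hypothetical example: cons the current state onto the recursive list, then increment it.
tick :: [Int] -> StateT Int Maybe [Int]
tick xs = StateT (\s -> Just (s : xs, s + 1))
```

Running mfixStateM tick from the state 5 yields an infinite list of 5's together with the
final state 6: the state is threaded through the loop once, while the value is cyclic.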



4.10 Summary 

In this chapter we have considered a wide range of monads and value recursion operators 
for them. Although there is no magic recipe to automate the process, the examples provide 
sufficient detail to guide the construction of value recursion operators for new monads. 

There is one notable exception, however. The continuation monad does not seem to
possess a value recursion operator. Chapter 5 contains the details.

We summarize the properties of value recursion operators we have studied in this
chapter in the following table, along with the IO monad (studied in Chapter 8). The last
column indicates whether the corresponding value recursion operator is unique. A cell
marked with * indicates a conjecture.





                           Str.  Pure  Left  Slide  Nest  S. Slide  Right  Unique
    Identity                ✓     ✓     ✓     ✓      ✓       ✓        ✓      ✓
    Exceptions              ✓     ✓     ✓     ✓      ✓       ✗        ✗      ✓
    Lists                   ✓     ✓     ✓     ✓      ✓       ✗        ✗      ✓
    State        mfix_0     ✓     ✓     ✓     ✓      ✓       ✗        ✗      ✗
                 mfix_i     ✓     ✓     ✓     ✓      ✗       ✗        ✗      ✗
                 mfix_ω     ✓     ✓     ✓     ✓      ✓       ✓        ✓      ✗
    Monoids      mfix_0     ✓     ✓     ✓     ✓      ✓       ✗        ✗      ✗
                 mfix_i     ✓     ✓     ✓     ✓      ✗       ✗        ✗      ✗
                 mfix_ω     ✓     ✓     ✓     ✓      ✓       ✓        ✓      ✗
    Environment             ✓     ✓     ✓     ✓      ✓       ✓        ✓      ✓*
    Tree                    ✓     ✓     ✓     ✓      ✓       ✗        ✗      ✓
    Fudgets                 ✓     ✓     ✓     ✓*     ✓*      ✗        ✗      ✗
    IO                      ✓     ✓     ✓     ✓*     ✓*      ✗        ✗      ✓*



Let us conclude this chapter by making several observations about value recursion 
operators: 

• We might hope that mfix constructs a fixed point value in the process of computation.
Unfortunately, in general, we cannot expect to find a value z_f such that
mfix f = f z_f. Consider the function f xs = [1 : xs, 2 : xs]. There is no
value z_f such that f z_f = [1 : 1 : ..., 2 : 2 : ...], which is the required result
in this case. Similarly, in the case of the state monad, the closest we can get is:
λs. f (fix (λa. π₁ (f a s))) s, which shows that the state in which the recursive
computation gets performed is essential in determining the final result. Similar
comments apply to the expression mfix f = m_f >>= f as well.

• Similarly, one might hope for a morphism suppress :: m α → m α, such that

    mfix f = suppress (mfix f) >>= f

The aim of suppress is to strip out effects. There are some monads for which such a
morphism is available, but not in general. For instance, for the state monad:

    suppress f = λs. let (a, _) = f s in (a, s)

Intuitively, suppress can only exist when there is a clear structural separation between
values and effects. For instance, such a separation seems impossible for the
maybe or list monads.

• The equality mfix f >> g = f ⊥ >> g does not hold in general. Since the value
produced by mfix f is discarded, one might think that the recursive computation
may be skipped as well. However, g might depend on the effects performed by the
first computation, which might very well be different for mfix f and f ⊥.

• It is worth reemphasizing some differences between fix and mfix. Recall that fix
satisfies the equality fix (f · f) = fix f, for all f. However, it is not the case that
mfix f = mfix (λx. f x >>= f), unless f is pure. In general, this equation only
holds when the underlying monad is idempotent (see Section 3.2). Similarly, the
equation fix (f · h) = f (fix (h · f)) translates into

    mfix (map h · f) = map h (mfix (f · h))

and requires f ⊥ = f (h ⊥) (see Section 2.4). Most importantly, the defining
equation for fix, fix f = f (fix f), simply does not have any counterpart in the
value recursion world. The unfolding view of recursion is not suitable for explaining
value recursion except for very mild effects (such as identity and environments), as
it does not distinguish between values and effects at all.
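
The first two observations can be checked on small examples. The sketch below makes
two assumptions: that the MonadFix instance for lists in Control.Monad.Fix agrees with
the list operator of this chapter, and that the state-monad operator has the usual lazy
form shown as mfixS. The names twoStreams, State, bindS, mfixS, and tick are hypothetical.

```haskell
import Control.Monad.Fix (mfix)

-- First observation: the result holds two different cyclic lists,
-- so it cannot be written as f z for any single argument z.
twoStreams :: [[Int]]
twoStreams = mfix (\xs -> [1 : xs, 2 : xs])

-- The state monad as plain functions.
type State s a = s -> (a, s)

bindS :: State s a -> (a -> State s b) -> State s b
bindS m k = \s -> let (a, s') = m s in k a s'

-- Assumed lazy value recursion operator for state.
mfixS :: (a -> State s a) -> State s a
mfixS f = \s -> let (a, s') = f a s in (a, s')

-- Second observation: suppress keeps the value and discards the state change.
suppress :: State s a -> State s a
suppress m = \s -> let (a, _) = m s in (a, s)

-- Hypothetical example: cons the current state, then increment it.
tick :: [Int] -> State Int [Int]
tick xs = \s -> (s : xs, s + 1)
```

For tick, both mfixS tick and suppress (mfixS tick) `bindS` tick produce a list that
starts with the initial state repeated and end in the same final state, which is one
instance of the equation mfix f = suppress (mfix f) >>= f.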



Chapter 5 
Continuations and value recursion 



Is there a value recursion operator for the continuation monad? Originally designed to 
model jumps, continuations come close to being the "universal" monad [24], and their
interaction with recursion proves to be quite intricate. In this chapter, we will take a 
closer look at the structure of continuations from the viewpoint of value recursion. 

Synopsis. We start with a review of the continuation monad, and continue by showing 
that a value recursion operator for continuations is highly unlikely to exist. After a brief 
discussion of the continuation monad transformer, we turn to first-class continuations, 
as found in Standard-ML and Scheme languages. We explore the interaction between 
recursive binding constructs and first-class continuations, showing that the left shrinking 
property is unattainable in such a setting. 

5.1 A monad for continuations 

Traditionally, continuation-passing style (CPS) has been used to model jumps in programming
languages [90]. Continuations provide an extremely powerful effect, especially
first-class continuations as supported by SML of New Jersey and Scheme [31, 34]; hence
effective use of continuations requires great care. As demonstrated by Thielecke, many
seemingly obvious equivalences fail to hold in the presence of a call-by-current-continuation
construct [84]. We will see a particular example related to recursion in Section 5.3.




Computations based on CPS can be described using monads. Wadler discusses monads
for continuations in a typed setting [90], while Espinosa's thesis contains a discussion in
the untyped world [22]. A typical continuation monad has the following structure:

    type Cont σ α = (α → σ) → σ

    return x = λk. k x
    m >>= h  = λk. m (λv. h v k)

The type variable σ encodes the result type. For any type σ, continuation-based computations
with a result value of type σ are modeled by the monad Cont σ. Other operations
on continuations include run, which provides an initial continuation; abort, which ignores
its continuation and immediately returns a result; and callcc, which enables saving the
current continuation for later use:

    run :: Cont σ σ → σ                abort :: σ → Cont σ α
    run m = m id                       abort e = λk. e

    callcc :: ((τ → Cont σ τ) → Cont σ τ) → Cont σ τ
    callcc h = λk. h (λv. (λc. k v)) k

It is worth noting that run takes continuations of type Cont σ σ, i.e., the argument and
the result types are the same. (Similarly, the result of abort is also restricted.) In callcc,
the function h is given a handle to the current continuation k. If h uses its first argument,
the control will be transferred to the point where callcc h was originally invoked. Note
that the inner argument, λc. k v, ignores its own continuation c, transferring the control
back to k. Otherwise h might ignore its first argument, proceeding normally.
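
The operations just described can be sketched in Haskell as follows. The newtype
wrapper Cont with accessor runCont and the examples earlyExit, aborted, and ignored are
assumptions made for illustration; they are not definitions from the text.

```haskell
newtype Cont r a = Cont { runCont :: (a -> r) -> r }

instance Functor (Cont r) where
  fmap h m = Cont (\k -> runCont m (k . h))

instance Applicative (Cont r) where
  pure x    = Cont (\k -> k x)
  mf <*> mx = Cont (\k -> runCont mf (\h -> runCont mx (k . h)))

instance Monad (Cont r) where
  m >>= h = Cont (\k -> runCont m (\v -> runCont (h v) k))

-- Supply the identity function as the initial continuation.
run :: Cont r r -> r
run m = runCont m id

-- Ignore the continuation and return the answer immediately.
abort :: r -> Cont r a
abort e = Cont (\_ -> e)

-- Pass the current continuation k to h; the escape function drops its own continuation.
callcc :: ((a -> Cont r a) -> Cont r a) -> Cont r a
callcc h = Cont (\k -> runCont (h (\v -> Cont (\_ -> k v))) k)

-- h uses its argument: control jumps back to the callcc point with the value 1.
earlyExit :: Cont r Int
earlyExit = callcc (\exit -> exit 1 >>= \_ -> return 2)

-- abort discards the rest of the computation, so the increment never runs.
aborted :: Cont Int Int
aborted = abort 42 >>= \x -> return (x + 1)

-- h ignores its argument and proceeds normally.
ignored :: Cont r Int
ignored = callcc (\_ -> return 3)
```

The three examples correspond to the three behaviours described above: escaping through
the captured continuation, aborting, and proceeding normally.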

Let us now turn to the question of value recursion for the continuation monad. Recall
that a value recursion operator has type (α → m α) → m α, where m is the underlying
monad. Expanding this type for continuations, we get

    mfix :: (α → (α → σ) → σ) → (α → σ) → σ                               (5.1)

where σ is the type of answers. Following the general pattern for value recursion, we
need to perform the fixed-point computation over α. However, it is simply not possible
to obtain a plausible value of type α by only using the arguments to mfix. Indeed, we
were not able to produce a plausible definition of mfix of even the correct type for the
continuation monad, let alone a definition that would satisfy the required properties.

Let us explore the situation a bit more closely. Being explicit about the quantification,
we can rewrite Type 5.1 as:

    ∀σ.∀α. (α → (α → σ) → σ) → (α → σ) → σ                                 (5.2)

What are the inhabitants of this type? Fixing an answer type σ, we see that Type 5.2
is isomorphic to:

    ∀α. ((α → σ) → α → σ) → (α → σ) → σ                                    (5.3)

and it is not hard to see that this type is (infinitely) inhabited if we have a fixed-point
operator. Each one of the following cases forms a class of inhabitants:

    mfix' f k = f^i (const v) ⊥_α,    i > 0, v ∈ σ
    mfix' f k = f^i k ⊥_α,            i > 0                                 (5.4)
    mfix' f k = (fix f) ⊥_α

By v ∈ σ, we mean that v is an element of the domain that models the type σ. Each mfix'
gives rise to an mfix via the equation mfix = mfix' · flip, and vice versa.¹

We conjecture that the Equation set 5.4 completely covers all the inhabitants of
Type 5.3. The proof attempt for such a claim would require an in-depth analysis of
the type, and is beyond the scope of the current work.

Conjecture 5.1.1 Let σ be an arbitrary type. Every inhabitant of Type 5.3 falls into
one of the categories given by Equation set 5.4. □

Proposition 5.1.2 None of the candidate definitions for mfix' gives rise to an mfix that
would satisfy the purity law.

Proof We will only prove the case

    mfix' f k = f^i k ⊥_α,    i > 0

Other cases are similar, if not simpler. Let α be a type and h be a function of type α → α.
By purity, we must have:

    mfix (return · h) k = return (fix h) k = k (fix h)

Fix a natural number i. By the chosen definition of mfix', we need:

    (flip (return · h))^i k ⊥_α = k (fix h)                                (5.5)

It is easy to see that:

    (flip (return · h))^i k = k · h^i                                      (5.6)

Substituting 5.6 in 5.5 we get:

    k (h^i ⊥_α) = k (fix h)                                                (5.7)

Obviously, Equation 5.7 does not hold for all k and h, given that i is a fixed natural
number. □
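
A concrete instance of the failure of Equation 5.7 can be observed in Haskell. The
following is only an illustrative sketch: continuation computations are modeled as plain
functions, undefined plays the role of ⊥, and the names ret, mfixCand, h, and diverges are
hypothetical. With h = (1 :) the candidate unrolls h exactly i times, so a continuation
that inspects more than i elements runs into ⊥, whereas purity demands the infinite
list fix h.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- return for the continuation monad, written on plain functions.
ret :: a -> (a -> r) -> r
ret x = \k -> k x

-- The candidate mfix' f k = f^i k ⊥, combined with mfix = mfix' . flip.
mfixCand :: Int -> (a -> (a -> r) -> r) -> (a -> r) -> r
mfixCand i f k = (iterate (flip f) k !! i) undefined

h :: [Int] -> [Int]
h = (1 :)

-- True when forcing the whole (finite) list hits bottom.
diverges :: [Int] -> IO Bool
diverges xs = do
  r <- try (evaluate (length xs)) :: IO (Either SomeException Int)
  return (either (const True) (const False) r)
```

With i = 3, the continuation take 3 cannot tell the candidate apart from a correct
operator, but take 4 already exposes the undefined tail.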



1 The function flip is defined by the equation flip f x y = f y x.

Remark 5.1.3 By the previous proposition, we conclude that the continuation monad
(as defined in Section 5.1) does not possess a value recursion operator, provided
Conjecture 5.1.1 holds.

The reader might wonder what happens if we restrict σ to be the same as α in
Type 5.1, providing positive occurrences of α to work on. It is possible to show that there
is an infinite family of candidate mfix's in this case as well, but none of them satisfy our
requirements. We leave the details to the interested reader.

5.2 The continuation monad transformer 

The continuation monad transformer [53] is defined by:

    type ContT σ m α = (α → m σ) → m σ

    return a = λk. k a
    m >>= f  = λk. m (λa. f a k)
    lift m   = λk. m >>= k

Let mfixM be a value recursion operator for a monad m. Can we lift it through
the continuation monad transformer, obtaining a value recursion operator for the monad
ContT σ m? Following the recipe set forth by the examples of Section 4.9, we are led to
the following ill-typed definition:

    mfixContM :: (α → ContT σ m α) → ContT σ m α
    mfixContM f = λk. mfixM (λa. f a k)                    -- ill-typed!

Since the argument to mfixM has type α → m σ, whereas mfixM expects an argument of
type α → m α, the application is ill-typed. This failure is hardly surprising, as setting m
to be the identity monad would have resulted in a value recursion operator for the
continuation monad.

Remark 5.2.1 Magnus Carlsson has suggested that such a lifting might be possible
when restricted to monads that support the notion of mutable variables (personal
communication). In collaboration with Carlsson, we investigated a number of possible liftings,
but none of our attempts were satisfactory. In each case, it was fairly easy to show that
the required properties were violated. We conjecture that a viable lifting is not possible
even in this restricted setting, leaving the exploration of this idea for future work.

5.3 First-class continuations and value recursion

What sort of properties can we expect from value recursion operators in a setting with 
first-class continuations? First-class continuations allow programs to seize the control 
state of their own evaluators [31] . This facility is definitely more powerful than what 
the continuation monad of Section 5.1 provides, where programs can only manipulate 
continuations that are explicitly created and passed around by the programmer. 

Many seemingly obvious equivalences fail to hold in the presence of first-class
continuations. For instance, as shown by Thielecke, the equivalence (λx. False) (k True) = False
fails in the context callcc (λk. []). (We refer the interested reader to Thielecke's work for
many other interesting examples [84].) When we consider the equivalences dictated by our
properties, we see that they are simply too strong to hold in a language with first-class
continuations as well. For instance, consider the left shrinking property (Section 2.3),
which states the following equivalence:

    mfix (λx. a >>= λy. f x y) = a >>= λy. mfix (λx. f x y)

Recall that the computation represented by a does not use the recursion variable x (i.e., 
x is not free in a). However, in the presence of first class continuations, a can capture its 
continuation via a call to callcc, thereby getting a handle on / which uses x. That is, a 
can indirectly access x through /, breaking the left shrinking property. 

The following example in Scheme provides further insight into the problem. The 
example demonstrates that a simple equality between recursive and non-recursive bindings 
(even simpler than our left shrinking law) fails to hold in the Scheme case. (This example 
was brought to our attention by Amr Sabry, who traces it back to a message sent to the 
comp.lang.scheme newsgroup in 1988 by A. Bawden, titled "letrec and callcc implement
references.") Consider the following two Scheme expressions:

    (define (test1)
      (letrec ((x (call-with-current-continuation
                    (lambda (c) (list #T c)))))
        (if (car x)
            ((cadr x) (list #F (lambda () x)))
            (eq? x ((cadr x))))))

    (define (test2)
      (let ((x (call-with-current-continuation
                 (lambda (c) (list #T c)))))
        (if (car x)
            ((cadr x) (list #F (lambda () x)))
            (eq? x ((cadr x))))))

Note that these two expressions are the same character for character, except the first
one uses the recursive binding construct (letrec) of Scheme, while the second one uses
the non-recursive version (let). Intuitively, these expressions should evaluate to the same
result, since the bound variable, x, is not even mentioned in the right hand sides of the
bindings. Alas, these two expressions are not equivalent! When run, test1 evaluates to
#T, i.e., True, while test2 yields #F, i.e., False. Regarding this example, Bawden wondered
if there were any "...real compilers that make this mistaken optimization," considering that
we might view test2 as an optimized version of test1. Of course, our concern is quite
the opposite. We rather ask if there are any language constructs that might render the
implied equivalence invalid.

Understanding why these expressions yield different values requires an understanding 
of how Scheme is interpreted. We will try to convey the idea here as it is essential in 
understanding why the left shrinking property is likely to be too strong a requirement in 
the presence of first-class continuations. To keep the notation simple, let us rewrite these 
expressions in a more Haskell-like syntax:²

    test1 = letrec x = callcc (λc. (True, c))
            in if fst x
                  then snd x (False, const x)
                  else eq? x (snd x ())

    test2 = let x = callcc (λc. (True, c))
            in if fst x
                  then snd x (False, const x)
                  else eq? x (snd x ())

Intuitively, letrec x = A in B in Scheme is implemented by allocating a cell called x
with a bogus error value, computing the value of the expression A (with x in scope), and
then overwriting the cell x with the result [44]. This allocate-compute-overwrite paradigm
practically achieves the knot-tying implementation of recursion. The evaluation then goes
on with the expression B, again with x in scope. A simple let binding, on the other hand,
does not create a cell to start with: let x = A in B is interpreted by evaluating A, storing
the result into a newly created cell x, and evaluating B with x in scope. With this model
in mind, consider the letrec expression in the definition of test1:

    letrec x = callcc (λc. (True, c)) in ...

To interpret this expression, one allocates a cell named x, and initializes it with ⊥. Then,
the right hand side is interpreted. The crucial point is realizing what continuation is 
captured by the call to callcc. Recalling our description above, the following continuation 
will be captured: 



1. Let a be the argument passed to the continuation. Overwrite the cell x with a,

2. Evaluate the expression part of letrec, i.e., evaluate:

       if fst x then snd x (False, const x)
                else eq? x (snd x ())

Let us call the continuation described above κ. Now, the right hand side of the letrec
binding is computed, which returns the tuple (True, κ). Since the definition is not actually
recursive, the initial (undefined) value of x is not used. Then, the cell pointed to by x
is overwritten by this tuple and the interpreter continues on with the evaluation of the
body. Since fst x is True, the conditional takes its first branch. And it is exactly at
this point that we invoke the continuation through the expression snd x, which is passed
the argument (False, const x). Recalling the description of κ above, this tuple overwrites
the cell x. It is crucial to note the cyclic structure thus created: When called with an
argument, the function stored in the second element of x will return a pointer back to the
tuple itself. As dictated by step 2 of κ, we now evaluate the body with this new value
stored in the cell pointed to by x. But this time fst x is False, hence we end up evaluating
the expression eq? x (snd x ()). Since snd x () returns a pointer back to x, the call to
eq? checks for the pointer equality of x and x, which simply results in the value True.

2 The function eq? checks for pointer equality in Scheme, rather than structural equality.

What happens with test2? Since we have a non-recursive let expression, the cell for
x is not created before the right hand side is computed. Let us call the continuation
captured in this case φ. Here is our description of it:

1. Let a be the argument passed to the continuation. Store a in a new cell called x,

2. Evaluate the expression part of let, which is exactly the same as before.

To evaluate test2, we proceed by computing the right hand side of the let binding.
As before, we immediately get back the tuple (True, φ). Now a new cell named x is
created, which stores this tuple. The conditional again takes its first branch, and the
continuation is called with (False, const x). Unlike the previous case, however, the call to
the continuation creates a new cell named x, shadowing the earlier value of x: The cyclic
structure is no longer available! It is not hard to see what happens now. The body is
evaluated as before and the conditional takes its second branch. But this time we compare
two different tuples in the call to eq?. Hence the result is simply False.

The relevance of this example to the left shrinking property is obvious. Basically, the 
right hand side of the letrec binding, which is not recursive, corresponds to the constant 
computation in the left shrinking property. If left shrinking were to hold, we would be 
allowed to pull it out of the mfix loop, i.e., replace the recursive binding with a non- 
recursive one. As we have seen, at least in the Scheme case, such a transformation is not 
valid in the presence of first-class continuations. 
5.4 Summary

As we have seen, the continuation monad in Haskell (as defined in Section 5.1) does
not seem to have a suitable value recursion operator. A similar comment applies to the 
continuation monad transformer. Furthermore, in the case of first-class continuations, the 
properties we expect to hold for value recursion operators are simply too strong. 

Regarding the handling of recursive definitions arbitrarily mixed with computational 
effects in Scheme, Andrzej Filinski states (personal communication): 

...as far as I know, the only popular functional language that allows such defi- 
nitions is Scheme; and I believe that allowing them was a mistake. The extra 
generality is virtually never used, but it disallows some useful optimizations 
by unnecessarily constraining the implementation. It is well known that in the 
presence of call/cc, one can expose the imperative nature of letrec and use it 
to define a general mutable cell; any RnRS-conforming system must support 
this behavior no matter how it implements recursion... 



We share the same point of view. 



Chapter 6 



Traces and value recursion 



Trace operators were introduced into category theory by Joyal et al. as a means for
modeling feedback operations arising in physics and mathematics [43]. Later work by Hasegawa
bridged the gap between recursion and traces, establishing a one-to-one correspondence 
between fixed-point operators and traces over cartesian categories [91 CGS, \22[ GC2, [79]. Can 
we explain value recursion in this framework as well? The aim of this chapter is to review 
the recent research in this area, trying to gain a better understanding of value recursion. 

Synopsis. First, we will introduce parameterized value recursion operators, making the 
dependence on the environment explicit. After reviewing traced monoidal categories, we 
will show that value recursion operators give rise to traces for a restricted class of monads. 
Although the set of monads for which this is possible is quite small, the correspondence is 
strong enough for us to explore. The restriction arises as a consequence of trace axioms, 
which are simply too strong for value recursion in general. Motivated by this discussion, 
we will briefly review recent work by Paterson [66], and Benton and Hyland [5], which 
aims to generalize traces to premonoidal categories. 

6.1 Parameterized value recursion 

Recall that a value recursion operator for a monad m has type (a → m a) → m a. In a
categorical setting, one needs explicitly to account for terms that contain free variables,
i.e., variables that are defined in the enclosing environment. To do so, we parameterize
our type to:

    ((e, a) → m a) → (e → m a)

where e represents the environment. In the concrete case, e is generally a product, using 
the cartesian structure of the underlying language. Parameterized and non-parameterized 
value recursion operators are interdefinable:

    pmfix_{e,a} :: ((e, a) → m a) → (e → m a)
    pmfix_{e,a} f = λe. mfix_a (λa. f (e, a))                (6.1)

    mfix_a :: (a → m a) → m a
    mfix_a f = pmfix_{1,a} (f · π2) ()                       (6.2)




where 1 is the terminal object whose only element is written (). The choice for the terminal 
object is the natural one for e in Equation 6.2, as it represents the empty environment.
In fact, any type would do, since the environment is simply ignored. 
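
The two translations can be transcribed into Haskell almost literally. The following is a minimal sketch, assuming that the MonadFix class of Control.Monad.Fix supplies mfix; the names pmfixFromMfix, mfixFromPmfix, and ones are hypothetical and serve only as an illustration.

```haskell
import Control.Monad.Fix (MonadFix (mfix))

-- Equation 6.1: a parameterized operator obtained from mfix.
-- The environment e is simply passed along to the body.
pmfixFromMfix :: MonadFix m => ((e, a) -> m a) -> e -> m a
pmfixFromMfix f = \e -> mfix (\a -> f (e, a))

-- Equation 6.2: mfix recovered from a parameterized operator,
-- instantiating the environment with the unit type.
mfixFromPmfix :: ((((), a) -> m a) -> () -> m a) -> (a -> m a) -> m a
mfixFromPmfix pmfix f = pmfix (f . snd) ()

-- Hypothetical example: a round trip through both translations
-- in the Maybe monad, building a cyclic list.
ones :: Maybe [Int]
ones = mfixFromPmfix pmfixFromMfix (\xs -> Just (1 : xs))
```

Under these assumptions, fmap (take 3) ones evaluates to Just [1,1,1], which is what mfix itself would produce for the same function.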

Remark 6.1.1 Before proceeding further, a word on notation is in order. In this chapter,
we will be using a more categorical notation where appropriate. For instance, types will be
written with capital letters (as objects in a certain category), products will be written with
×, etc. This shift is unfortunate, but necessary. We do not want to impose a Haskell-like
notation when talking about categorical constructs: Such a coercion seems to complicate
matters even more. As an example, the type for pmfix in Equation 6.1 will be written:

    $\mathit{pmfix}_{A,X} : \mathcal{D}(A \times X, T X) \to \mathcal{D}(A, T X)$

where $\mathcal{D}$ is the category of domains and T is the underlying functor for the monad we are
considering. (The notation $\mathcal{D}(A, B)$ denotes the set of arrows from A to B in $\mathcal{D}$.) We
will stick to Latin letters for objects, following the general practice. The use of particular
letters (i.e., X for the recursion variable, and A for the parameter) is inherited from
Hasegawa's work [33]. Also, we will use categorical products and function spaces, rather
than Haskell's lifted versions.

The second generalization we want to make is more technical than the first. Rather
than considering the morphisms in the base category, we move to the Kleisli category of
the given monad. There is one difficulty, however. The Kleisli category is not necessarily
cartesian. More specifically, the binary operator inherited from the cartesian product of
the base category is not necessarily bifunctorial. We will see the details and implications
of this problem in Section 6.4.2. For the time being, let us just assume that we have
a product-like operation in the Kleisli category, named ×. Let $\mathcal{D}_T$ denote the Kleisli
category of a given strong¹ monad T over $\mathcal{D}$. It is easy to see that pmfix can be considered
as a family of functions with the type:

    $\mathit{pmfix}_{A,X} : \mathcal{D}_T(A \times X, X) \to \mathcal{D}_T(A, X)$        (6.3)

If $\mathcal{D}_T$ is cartesian, Type 6.3 is precisely the same as that of a Conway operator (see
Appendix A). This view of value recursion will prove essential in the following discussion.

1 A monad over a category with a monoidal operation $\otimes$ is called strong if there exists a natural
transformation $t_{A,B} : A \otimes T B \to T (A \otimes B)$, called the strength, subject to certain conditions [63]. It turns
out that all Haskell monads are strong, with the strength defined as t (a, tb) = tb >>= λb. return (a, b).



6.2 Preliminaries 

In this section, we review the central notions in the work of Joyal et al. and Hasegawa [33, 43],
covering symmetric monoidal categories, traces, and the correspondence between traces 
over cartesian categories and Conway operators. 



6.2.1 Symmetric monoidal categories 

In computer science, we often deal with binary operators that are associative only up to 
isomorphism. Monoidal operators and monoidal categories provide a setting where such 
operators can be modeled formally [2] [55] : 

Definition 6.2.1 (Symmetric Monoidal Category.) A symmetric monoidal category,
SMC for short, $\mathcal{M} = (\mathcal{M}, \otimes, I, a, l, r, s)$ is a category $\mathcal{M}$ with a bifunctor
$\otimes : \mathcal{M} \times \mathcal{M} \to \mathcal{M}$, an object $I \in \mathcal{M}$, and natural isomorphisms:

    $a_{A,B,C} : A \otimes (B \otimes C) \to (A \otimes B) \otimes C$
    $l_A : I \otimes A \to A$
    $r_A : A \otimes I \to A$
    $s_{A,B} : A \otimes B \to B \otimes A$

such that the following diagrams commute:

[Diagrams: Associativity Pentagon; Unit triangles and symmetry; Bilinearity.]

Example 6.2.2 All cartesian categories are symmetric monoidal. Let $\mathcal{C} = (\mathcal{C}, \times, 1)$ be
a cartesian category where $\times$ is the direct product with projections $\pi_1 : A \times B \to A$ and
$\pi_2 : A \times B \to B$. In this case, the natural isomorphisms of Definition 6.2.1 are given by:

    $a = \langle \langle \pi_1, \pi_1 \cdot \pi_2 \rangle, \pi_2 \cdot \pi_2 \rangle$        $a^{-1} = \langle \pi_1 \cdot \pi_1, \langle \pi_2 \cdot \pi_1, \pi_2 \rangle \rangle$
    $l = \pi_2$                                        $l^{-1} = \langle !_A, A \rangle$
    $r = \pi_1$                                        $r^{-1} = \langle A, !_A \rangle$
    $s = \langle \pi_2, \pi_1 \rangle$                 $s^{-1} = \langle \pi_2, \pi_1 \rangle$

where $!_A : A \to 1$ denotes the unique map to the terminal object. In Haskell notation
these morphisms correspond to the following functions (with more suggestive names):

    assoc (x, (y, z)) = ((x, y), z)        assoc⁻¹ ((x, y), z) = (x, (y, z))
    left ((), y) = y                       left⁻¹ y = ((), y)
    right (x, ()) = x                      right⁻¹ x = (x, ())
    swap (x, y) = (y, x)                   swap⁻¹ (y, x) = (x, y)




6.2.2 Traced symmetric monoidal categories 

Trace operators provide a categorical framework for studying cyclic structures:²

Definition 6.2.3 (Traced SMC.) A traced symmetric monoidal category is a symmetric
monoidal category $\mathcal{M} = (\mathcal{M}, \otimes, I, a, l, r, s)$ with a family of functions:

    $\mathit{Tr}^X_{A,B} : \mathcal{M}(A \otimes X, B \otimes X) \to \mathcal{M}(A, B)$

subject to the following conditions:

• Naturality in A (left tightening): For all $f : A' \otimes X \to B \otimes X$, $g : A \to A'$,

    $\mathit{Tr}^X_{A,B} (f \cdot (g \otimes X)) = \mathit{Tr}^X_{A',B} f \cdot g$.

• Naturality in B (right tightening): For all $f : A \otimes X \to B \otimes X$, $g : B \to B'$,

    $\mathit{Tr}^X_{A,B'} ((g \otimes X) \cdot f) = g \cdot \mathit{Tr}^X_{A,B} f$.

• Dinaturality in X (sliding): For all $f : A \otimes X' \to B \otimes X$, $g : X \to X'$,

    $\mathit{Tr}^X_{A,B} (f \cdot (A \otimes g)) = \mathit{Tr}^{X'}_{A,B} ((B \otimes g) \cdot f)$.

• Vanishing:

  - For all $f : A \to B$,

        $\mathit{Tr}^I_{A,B} (r^{-1} \cdot f \cdot r) = f$.

  - For all $f : A \otimes (X \otimes Y) \to B \otimes (X \otimes Y)$,

        $\mathit{Tr}^X_{A,B} (\mathit{Tr}^Y_{A \otimes X, B \otimes X} (a \cdot f \cdot a^{-1})) = \mathit{Tr}^{X \otimes Y}_{A,B} f$.

• Superposing: For all $f : A \otimes X \to B \otimes X$,

    $\mathit{Tr}^X_{C \otimes A, C \otimes B} (a \cdot (C \otimes f) \cdot a^{-1}) = C \otimes \mathit{Tr}^X_{A,B} f$.

• Yanking: For all X, $\mathit{Tr}^X_{X,X} (s_{X,X}) = X$.

2 The original work on traces was presented in the slightly more general setting of braided monoidal
categories [43]. Following Hasegawa [33], we only consider symmetric monoidal categories here.

The graphical versions of these axioms are given in Figure 6.1 [33]. It is worth
comparing these diagrams to those that we have given in Chapter 2 for mfix. The thick
lines in the figures for mfix represent monadic actions, i.e., side-effects, changes in the
state, etc., while the corresponding lines in Figure 6.1 represent data flow. The fixed-point
argument (i.e., X) is not directly available to the outside world in the formulation
of trace (although this limitation can be easily circumvented). In mfix, however, the result 
is the fixed-point value together with monadic actions. 



6.2.3 Traces and Conway operators 

The following theorem of Hasegawa (also independently established by Hyland) states the 
connection between traces and Conway operators (see Appendix A for a brief review of
Conway operators): 
[Figure 6.1: Trace Axioms. Panels: Left Tightening, Right Tightening, Sliding, Vanishing, Superposing, Yanking.]



Theorem 6.2.4 (Hasegawa, Hyland) A cartesian category is traced exactly when it 
possesses a Conway operator. 

Proof See Theorem 7.1.1 of Hasegawa's thesis [32]. □

The correspondence can be summarized as follows. Assuming we have a trace operator
Tr, we can define a Conway operator $(\cdot)^\dagger : \mathcal{C}(A \times X, X) \to \mathcal{C}(A, X)$ as follows:

    $f^\dagger = \mathit{Tr}^X_{A,X} (\Delta_X \cdot f) : A \to X$        (6.4)

Similarly, given a Conway operator $(\cdot)^\dagger$, we can define the following trace operator:

    $\mathit{Tr}^X_{A,B} f = \pi_1 \cdot f \cdot \langle A, (\pi_2 \cdot f)^\dagger \rangle : A \to B$        (6.5)

Since Conway operators provide a generalization of fixed-point operators on domains, 
traces on symmetric monoidal categories provide a firm categorical framework for studying 
fixed-point operators. 

Example 6.2.5 In the setting of domains and continuous functions, the unique least
fixed-point operator for a function $f : A \to A$ is given by:

    $\mathit{fix}\, f = \bigsqcup_i f^i \bot_A$

which gives rise to the following Conway operator: Given $f : A \times X \to X$,

    $f^\dagger = \lambda a.\, \mathit{fix}\, (\lambda x.\, f (a, x)) : A \to X$

And, by Equation 6.5, we obtain the following (unique) trace operator: Given $f : A \times X \to B \times X$,

    $\mathit{Tr}^X_{A,B} f = \pi_1 \cdot f \cdot \langle A, \lambda a.\, \mathit{fix}\, (\lambda x.\, \pi_2 (f (a, x))) \rangle : A \to B$

In Haskell-like notation, this definition simply reads:

    trace :: ((α, γ) → (β, γ)) → α → β
    trace f a = let (b, x) = f (a, x)                    (6.6)
                in b

which clearly shows the intent: The recursive knot is tied over x, leaving a function of
type A → B as the residue.
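
Equation 6.6 is already valid Haskell once the Greek type variables are renamed. The sketch below is only an illustration: the function powersOfTwo is a hypothetical example of ours, and it relies on the assumption that Haskell's lazy let-bound patterns play the role of the least fixed point used above.

```haskell
-- Equation 6.6: the recursive knot is tied over x, leaving a
-- function from a to b. The let-bound pattern is matched lazily,
-- so x may be used before the pair is fully evaluated.
trace :: ((a, x) -> (b, x)) -> a -> b
trace f a = let (b, x) = f (a, x) in b

-- Hypothetical example: the feedback value is the infinite list of
-- powers of two; the visible result is a finite prefix of it.
powersOfTwo :: Int -> [Integer]
powersOfTwo = trace (\(n, xs) -> (take n xs, 1 : map (* 2) xs))
```

With these definitions, powersOfTwo 5 evaluates to [1,2,4,8,16]: the list xs is defined in terms of itself through the feedback loop, while only the first component escapes.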

6.3 Traces and value recursion 

As we have seen in the preceding section, traces provide a natural framework for studying 
fixed-point operators, and by virtue of Theorem 6.2.4, the usual notion of recursion can
be explained by traces over cartesian categories. Does the correspondence hold up when 
we consider value recursion? It turns out that a close relationship can be established 
for commutative monads whose Kleisli categories are traced, but the trace axioms are 
simply too strong for the general case. Still, we will explore this limited correspondence 
closely, as it will help us identify the problems that arise in the general case. We start by 
examining two particular classes of monads: commutative monads and monads based on 
commutative monoids. 

6.3.1 Commutative monads and traces 

Let T be a strong commutative monad over an SMC $\mathcal{M} = (\mathcal{M}, \otimes, I, a, l, r, s)$ with the
given strength t. We write $\eta$ for the unit, and $\mu$ for the multiplier of T. The monoidal
structure over $\mathcal{M}$ carries over to the Kleisli category of T, denoted $\mathcal{M}_T$, as follows:

    $\mathcal{M}_T = (\mathcal{M}_T, \otimes', I, \eta \cdot a, \eta \cdot l, \eta \cdot r, \eta \cdot s)$

The monoidal operator $\otimes$ is lifted to $\mathcal{M}_T$ as follows. On objects, $\otimes'$ is defined to be the
same as $\otimes$. On arrows, $f \otimes' g$ is defined to be the arrow $\Theta \cdot (f \otimes g)$ in $\mathcal{M}$, where

    $\Theta : T A \otimes T B \to T (A \otimes B)$
    $\Theta = \mu \cdot T t \cdot t'$

Recall that t' is the dual of t, given by $T s \cdot t \cdot s$. Since T is commutative, the other
candidate for $\Theta$, i.e., $\mu \cdot T t' \cdot t$, yields exactly the same arrow.
In the case of CCC's, a trace operator on the Kleisli category of a commutative monad
gives rise to a parameterized value recursion operator on the underlying category. To see
this, let $\mathcal{C}$ be a CCC, and T be a commutative monad over $\mathcal{C}$. If $\mathcal{C}_T$ is traced, we have a
family of functions:

    $\mathit{Tr}^X_{A,B} : \mathcal{C}_T(A \times' X, B \times' X) \to \mathcal{C}_T(A, B)$

which implies the existence of the following family of functions in $\mathcal{C}$:

    $\mathit{Tr}^X_{A,B} : \mathcal{C}(A \times X, T (B \times X)) \to \mathcal{C}(A, T B)$

Hence, a candidate parameterized value recursion operator can be defined by setting:

    $\mathit{pmfix}_{A,X}\, f = \mathit{Tr}^X_{A,X} (T \Delta \cdot f)$        (6.7)

where $\Delta\, a = (a, a)$.

Example 6.3.1 The environment monad provides a nice example of obtaining a value
recursion operator from a trace. For a fixed object E in a CCC, the functor T A = E → A,
i.e., the exponentiation functor with the first argument fixed, gives rise to the environment
monad. For convenience, we will stick to the Haskell notation. The monad structure and
the strength are given by:

    return a = λe. a
    join f = λe. f e e
    t (a, f) = λe. (a, f e)

It is easy to see that T is commutative. The Kleisli category is traced, and the
corresponding family of functions in the base category is given by:

    trace :: ((α, τ) → (E → (β, τ))) → α → (E → β)
    trace f a = λe. let (b, x) = f (a, x) e                    (6.8)
                    in b

Using Equations 6.7 and 6.2, we get:

    mfix :: (α → (E → α)) → E → α
    mfix f = λe. let (b, x) = (map (λz. (z, z)) · f · π2) ((), x) e
                 in b

Recalling map f g = f · g for the environment monad, we can simplify this definition to:

    mfix f = λe. let x = f x e
                 in x

which is precisely the value recursion operator that we have given in Section 4.6 for the
environment monad.
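
The simplified operator can be tried out directly. The following is a small sketch under the assumption that the environment monad is represented by Haskell's function type, for which Control.Monad.Fix already provides an mfix; the names mfixEnv, cyclic, and cyclic' are hypothetical.

```haskell
import Control.Monad.Fix (mfix)

-- Value recursion for the environment monad, as derived above:
-- the environment e is shared, and the knot is tied over x.
mfixEnv :: (a -> e -> a) -> e -> a
mfixEnv f = \e -> let x = f x e in x

-- Hypothetical example: a list defined in terms of itself, with
-- its elements read from the environment.
cyclic :: Int -> [Int]
cyclic = mfixEnv (\xs e -> e : take 2 xs)

-- The same computation through the library's mfix for functions.
cyclic' :: Int -> [Int]
cyclic' = mfix (\xs e -> e : take 2 xs)
```

Here cyclic 7 evaluates to [7,7,7], and cyclic' agrees with it, which suggests that the derived operator coincides with the one shipped for the function monad.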

Example 6.3.2 This example demonstrates that having a commutative monad is not
sufficient to guarantee the construction of a value recursion operator: The corresponding
Kleisli category should be traced as well. As an example, consider modeling exceptions in
Set by disjoint sums, using the endofunctor T A = 1 + A, where 1 is the terminal object.
In Haskell-like notation, the monad structure and the strength are given by:

    η a = inr a
    μ (inl ()) = inl ()
    μ (inr a) = a
    t (a, inl ()) = inl ()
    t (a, inr b) = inr (a, b)

It is easy to see that T gives rise to a commutative monad, and hence its Kleisli category
is symmetric monoidal. If $\mathit{Set}_T$ is traced, then we should have a family of arrows
$\mathit{Tr}^X_{A,B} : \mathit{Set}_T(A \otimes X, B \otimes X) \to \mathit{Set}_T(A, B)$, where $\otimes$ is the lifting of the cartesian product. Hence,
we must have a family of arrows $\mathit{Set}(A \times X, 1 + (B \times X)) \to \mathit{Set}(A, 1 + B)$. However,
since the computation might fail, we do not have a way of getting an X to tie the recursive
knot. In this case, the Kleisli category does not seem to possess a trace.

The reader might appreciate the situation in Haskell. The exception monad is the
usual Maybe monad, except the Haskell version is not commutative (due to the possibility
of non-termination). Ignoring the non-termination issue for a moment, we would need to
find a trace operator with the type:

    ((α, τ) → Maybe (β, τ)) → α → Maybe β

Here, τ is the recursion argument, on which we need to tie the recursive knot. However,
the required trace operator just does not exist, since we are not guaranteed to get a τ to
form the required recursive loop. (Recall that there is no such problem for value recursion
in our setting; see Section 4.2 for details.)
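
The contrast with value recursion can be observed in Haskell. The sketch below assumes the mfix instance for Maybe exported by Control.Monad.Fix; the names ones and failing are hypothetical. It illustrates only that mfix for Maybe copes with both success and failure, not that a trace is impossible.

```haskell
import Control.Monad.Fix (mfix)

-- The computation succeeds and uses the recursive value lazily,
-- so the knot can be tied.
ones :: Maybe [Int]
ones = mfix (\xs -> Just (1 : xs))

-- The computation fails without inspecting the recursive value;
-- the failure is simply propagated.
failing :: Maybe [Int]
failing = mfix (\_ -> Nothing)
```

Here fmap (take 3) ones evaluates to Just [1,1,1] and failing evaluates to Nothing. In the failing case no value of the recursion type is ever produced, which is exactly the situation that a trace, with its separate feedback component, cannot accommodate.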

6.3.2 Monads arising from commutative monoids 

In Section 4.5, we explored monads that arise from monoids. In this section, we will
concentrate on those monads that are obtained from commutative monoids, and see how 
a trace operator in the underlying category can be used to obtain a value recursion operator 
for the corresponding representation monad. 

The usual definition of monoids on sets can be generalized to arbitrary monoidal
categories [55]. Let $\mathcal{M} = (\mathcal{M}, \otimes, I, a, l, r, s)$ be a symmetric monoidal category. A monoid
in $\mathcal{M}$ is a triple $M = (M, +, e)$ where $M \in \mathcal{M}$, $+ : M \otimes M \to M$, $e : I \to M$, with the
usual associativity and unit laws. (The monoid is commutative if $+ \cdot s = +$, i.e., if the order
of arguments to + does not matter.) For such a monoid, the endofunctor $T A = M \otimes A$
gives rise to the following strong monad, known as M's representation monad [2]:

    $\eta_A = (e \otimes A) \cdot l_A^{-1}$        (6.9)
    $\mu_A = (+ \otimes A) \cdot a_{M,M,A}$        (6.10)
    $t_{A,B} = a^{-1}_{M,A,B} \cdot (s_{A,M} \otimes B) \cdot a_{A,M,B}$        (6.11)

If M is commutative, then T will be commutative as well.

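After all this machinery, we can finally state our goal. Let $\mathcal{M}$ be a traced SMC, and M
be a commutative monoid in $\mathcal{M}$, whose representation monad is T. As we have seen, T
is commutative and hence its Kleisli category is symmetric monoidal. Furthermore, the
trace on $\mathcal{M}$ lifts into $\mathcal{M}_T$, i.e., $\mathcal{M}_T$ is also traced. If $\mathit{Tr}^X_{A,B} : \mathcal{M}(A \otimes X, B \otimes X) \to \mathcal{M}(A, B)$
is the trace operator on $\mathcal{M}$, the trace operator on $\mathcal{M}_T$ is given by:

    ${\mathit{Tr}'}^X_{A,B} : \mathcal{M}_T(A \otimes' X, B \otimes' X) \to \mathcal{M}_T(A, B)$
    ${\mathit{Tr}'}^X_{A,B} f = \mathit{Tr}^X_{A, M \otimes B} (a \cdot f)$        (6.12)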

Example 6.3.3 Consider the monoid (N, +, 0), where N is the flat domain of natural
numbers, and + is addition. The corresponding functor is T A = N × A. As outlined
above, the monad structure is given by (in Haskell):

    return x = (0, x)
    join (m, (n, x)) = (m+n, x)
    t (x, (m, y)) = (m, (x, y))
    t' ((m, x), y) = (m, (x, y))

Since map f (m, x) = (m, f x), we have:

    (join · map t' · t) ((m, x), (n, y)) = (n+m, (x, y))
    (join · map t · t') ((m, x), (n, y)) = (m+n, (x, y))

Hence, the commutativity follows from the commutativity of +, as promised. Recall from
Example 6.2.5 that the trace on the underlying category is given by:

    trace f a = let (b, x) = f (a, x) in b

which, by Equation 6.12, can be treated as a trace operator on the Kleisli category of T
with the type (A × X → N × (B × X)) → (A → N × B). More explicitly, we have (where
we use Integer to represent N):

    tr' :: ((α, σ) → (Integer, (β, σ))) → (α → (Integer, β))
    tr' f a = let (b, x) = (assoc · f) (a, x) in b

By Equation 6.7, we obtain the following parameterized value recursion operator:

    pmfix f = λa. let (b, x) = (assoc · map (λx. (x, x)) · f) (a, x)
                  in b

which gives rise to the following value recursion operator by Equation 6.2:

    mfix :: (α → (Integer, α)) → (Integer, α)
    mfix f = let (b, x) = (assoc · map (λx. (x, x)) · f) x in b

By expanding the definitions and simplifying, one obtains:

    mfix :: (α → (Integer, α)) → (Integer, α)
    mfix f = let (n, x) = f x
             in (n, x)

which is precisely the value recursion operator we have given for monads based on monoids
(Equation 4.19) in Section 4.5.
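
The resulting operator runs as written. The following minimal sketch uses hypothetical names (mfixCount, example) and pairs an Integer count with the result, as in the example above.

```haskell
-- Value recursion for the monad T a = (Integer, a): the count is
-- produced once, and the knot is tied over the value component.
mfixCount :: (a -> (Integer, a)) -> (Integer, a)
mfixCount f = let (n, x) = f x in (n, x)

-- Hypothetical example: a computation that reports the count 3 and
-- returns a list defined in terms of itself.
example :: (Integer, [Int])
example = mfixCount (\xs -> (3, 0 : take 2 xs))
```

Under these definitions, example evaluates to (3,[0,0,0]). The count is not affected by the recursion, since the function is applied exactly once.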

Remark 6.3.4 It is important to note that the commutativity of the monoid does not 
play any role in establishing the requirements of value recursion, although it is essential 
for constructing a trace. If the monoid is not commutative, the representation monad will 
not be commutative either, failing to yield a monoidal structure on the Kleisli category. In 
that case, one cannot even talk about the notion of trace, as Definition 6.2.3 only applies
to symmetric monoidal categories. We will return to this issue in Section 6.4.2.

6.3.3 The correspondence 

We now turn to the correspondence between value recursion operators for commutative 
monads and trace operators over Kleisli categories. Before doing so, we will need to 
consider what trace axioms mean in the Kleisli category of a given monad. Let T be the 
monad under consideration. In this setting, the trace axioms read:³

3 In these equations, we use the Haskell notation and try to name variables according to their types,
i.e., a variable named a is of type A. Note the use of shadowing in λ-bindings, where we reuse variable
names to stick to our convention. Compared to the original trace axioms, these versions are indeed very
ugly to look at, but they are much more intuitive from a programming perspective. Also, to save space,
we use η to abbreviate return.



• Left tightening: For all f : A' × X → T (B × X), g : A → T A',

    Tr (λ(a, x). g a >>= λa'. f (a', x)) = λa. g a >>= Tr f                    (6.14)

• Right tightening: For all f : A × X → T (B × X), g : B → T B',

    Tr (λ(a, x). f (a, x) >>= λ(b, x). g b >>= λb'. η (b', x))
      = λa. Tr f a >>= g                                                        (6.15)

• Sliding: For all f : A × X' → T (B × X), g : X → T X',

    Tr (λ(a, x). g x >>= λx'. f (a, x'))
      = Tr (λ(a, x'). f (a, x') >>= λ(b, x). g x >>= λx'. η (b, x'))            (6.16)

• Vanishing: For all f : A → T B,

    Tr (λ(a, ()). f a >>= λb. η (b, ())) = f                                    (6.17)

  and, for all f : A × (X × Y) → T (B × (X × Y)),

    Tr (Tr (λ((a, x), y). f (a, (x, y)) >>= λ(b, (x, y)). η ((b, x), y)))
      = Tr f                                                                    (6.18)

• Superposing: For all f : A × X → T (B × X),

    Tr (λ((c, a), x). f (a, x) >>= λ(b, x). η ((c, b), x))
      = λ(c, a). Tr f a >>= λb. η (c, b)                                        (6.19)

• Yanking:

    Tr (λ(x1, x2). η (x2, x1)) = η                                              (6.20)

After these preliminaries, we can finally state the main result of this chapter: 

Proposition 6.3.5 Let $\mathcal{D}$ be the category of domains, and T be a commutative monad
over $\mathcal{D}$. Let mfix be a value recursion operator for T, further satisfying strong sliding,
nesting, and right shrinking laws. Then, the family of functions

    $\mathit{trace}^X_{A,B} : \mathcal{D}(A \times X, T (B \times X)) \to \mathcal{D}(A, T B)$
    trace f = λa. mfix_{B×X} (λ(b, x). f (a, x)) >>= η · π1                    (6.21)

will satisfy Equations 6.14-6.20, i.e., it will provide a trace operator for $\mathcal{D}_T$.
Proof See Appendix B.8 for the full derivation. We try to summarize the correspondence
at a higher level here. Unsurprisingly, left and right tightenings depend on the left
and right shrinking properties of mfix respectively. Sliding requires the use of Proposi- 
tion which depends on the commutativity of the monad and strong sliding (of mfix). 
The first vanishing rule depends on left shrinking and purity, the second one also uses 
nesting. The superposing rule only needs pure right shrinking (which is guaranteed by 
right shrinking). Finally, yanking is a direct consequence of purity. □
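
The construction of Equation 6.21 can be sketched in Haskell as follows. This is an illustration only: it assumes the MonadFix class of Control.Monad.Fix as the source of mfix, it uses a lazy pattern to mimic the categorical (unlifted) product, and the names traceM, repeatEnv, and prefixOfOnes are hypothetical. Whether the trace axioms hold depends on the monad, as discussed above.

```haskell
import Control.Monad.Fix (MonadFix (mfix))

-- Equation 6.21: feed the second component of the result back as
-- the recursion variable, and return only the first component.
-- The lazy pattern ~(_, x) delays inspection of the recursive pair.
traceM :: MonadFix m => ((a, x) -> m (b, x)) -> a -> m b
traceM f a = fmap fst (mfix (\ ~(_, x) -> f (a, x)))

-- Hypothetical example in the environment monad: the feedback value
-- is an infinite list of the environment character.
repeatEnv :: Int -> Char -> String
repeatEnv = traceM (\(n, xs) e -> (take n xs, e : xs))

-- Hypothetical example in the Maybe monad, where the body succeeds.
prefixOfOnes :: Int -> Maybe [Int]
prefixOfOnes = traceM (\(n, xs) -> Just (take n xs, 1 : xs))
```

With these definitions, repeatEnv 3 'z' evaluates to "zzz" and prefixOfOnes 2 evaluates to Just [1,1].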

Remark 6.3.6 Ideally, we should also establish that a trace operator on the Kleisli 
category of a commutative monad yields a value recursion operator, using a translation of 
the form: 



But we will refrain from pursuing the correspondence in this direction for the following 
reasons: 

• Our treatment of value recursion operators takes place in the setting of continuous 
functions over domains. On the other hand, trace operators are presented in the 
abstract setting of monoidal categories, hence the assumptions for the underlying 
structure are significantly weaker. For instance, it is not clear what our strictness
axiom (i.e., $f \bot_A = \bot_{T A}$ iff $\mathit{mfix}_A f = \bot_{T A}$) would correspond to in this setting.⁴

• As we explored above, the correspondence of traces and value recursion is rather 
limited. Very few monads are commutative, and even fewer have their Kleisli cate- 
gories traced. What we should seek, then, is a notion of trace in the non-monoidal 
case. In short, trace axioms are just too strong for value recursion. 

6.4 Dropping the monoidal requirement 

As we have seen in the preceding section, the trace-based categorical account of fixed- 
point operators falls short of explaining value recursion for all but a very restricted set 
of monads. Is it possible to generalize the theory of traces so that we can accommodate 
value recursion more satisfactorily? In this section, we will briefly review two recent 

4 Hasegawa suggests that it might be possible to study strictness via the notion of uniform trace opera- 
tors. See Proposition 7.1.4 in Hasegawa's thesis [33]. 



mfix A 



(A^T A) T A 



mfix f 
attempts in this direction. First, we will look at Paterson's work, which lifts mfix to
the world of arrows [66]. Second, we will review Benton and Hyland's work on traced
premonoidal categories [5]. It turns out that both attempts describe essentially the same
axiomatization, although presented in slightly different contexts. The general idea is to
move to premonoidal categories, effectively dropping the monoidal requirement.⁵

6.4.1 Arrows and loop 

Hughes introduced arrows as a generalization of monads, making the input-output flow
more explicit [36]. An arrow ⇒ is a binary type constructor equipped with:

    arr : (α → β) → (α ⇒ β)
    >>> : (α ⇒ β) → (β ⇒ γ) → (α ⇒ γ)
    first : (α ⇒ β) → (α × γ ⇒ β × γ)

Intuitively, α ⇒ β represents a computation that receives an input of type α, performs
a computation with possible side effects, and delivers a result of type β, corresponding
to what an imperative programmer might call a procedure. The morphism arr makes a
procedure out of a pure function, while >>> runs two procedures in sequence, threading the
result of the first to the second. The function first lets information be passed around for
later use, mainly for storing results of intermediate computations. The morphisms
arr, >>>, and first are required to satisfy a number of laws, similar to the monad laws.

Example 6.4.1 Arrows generalize monads in the following sense. For every monad m,
the type Kleisli m gives rise to an arrow, where:

    type Kleisli m τ σ = τ → m σ
    arr f = return · f
    f >>> g = λa. f a >>= g
    first f = λ(a, c). f a >>= λb. η (b, c)
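
Haskell's Control.Arrow module ships a Kleisli newtype with arrow operations of this shape. The sketch below assumes that module; the names incThenBranch and bumpLeft are hypothetical examples of ours.

```haskell
import Control.Arrow (Kleisli (..), arr, first, (>>>))

-- A pure function lifted with arr, then sequenced with an
-- effectful step in the list monad.
incThenBranch :: Kleisli [] Int Int
incThenBranch = arr (+ 1) >>> Kleisli (\x -> [x, x * 10])

-- first runs a computation on the left component of a pair and
-- carries the right component along unchanged.
bumpLeft :: Kleisli Maybe (Int, Char) (Int, Char)
bumpLeft = first (Kleisli (\x -> Just (x + 1)))
```

Here runKleisli incThenBranch 1 evaluates to [2,20], and runKleisli bumpLeft (1, 'c') evaluates to Just (2,'c').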

Paterson argues that Power and Thielecke's Freyd categories are equivalent to Hughes's
arrows [73]. (We will briefly review Freyd categories in Section 6.4.2.)

Is there a corresponding notion of value recursion for arrows? Paterson generalizes
mfix to arrows, introducing the following loop operator [66]:

    loop : (α × γ ⇒ β × γ) → (α ⇒ β)                    (6.22)

5 We mention in passing that Jeffrey also used the so-called partial traces (ordinary traces that are
restricted to be applied only to certain maps) to model flow graphs and recursion in programming
languages [39]. We will not review his work here, however, as it is not directly related to value recursion.
Note the similarity between this type and the type of trace operators (Definition 6.2.3).
As expected, value recursion operators give rise to loop operators for the corresponding
Kleisli arrows. Given a value recursion operator mfix, Paterson defines:

    loop f = map π1 · mfix · f'
      where f' x y = f (x, π2 y)                    (6.23)

which can be shown to be equivalent to the function we have given for obtaining a trace
operator from mfix (Equation 6.21).
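
Equation 6.23 can be transcribed on plain Kleisli functions. The following is a sketch assuming the MonadFix class of Control.Monad.Fix; the names loopK and prefixOfOnes are hypothetical. No lazy pattern is needed here, because snd is applied to the recursive pair only when the feedback value is demanded.

```haskell
import Control.Monad.Fix (MonadFix (mfix))

-- Equation 6.23 on functions of type (a, c) -> m (b, c): the second
-- component of the result is fed back, the first one is returned.
loopK :: MonadFix m => ((a, c) -> m (b, c)) -> a -> m b
loopK f = fmap fst . mfix . f'
  where
    f' x y = f (x, snd y)

-- Hypothetical example in the Maybe monad: the feedback value is an
-- infinite list of ones, and the result is a finite prefix of it.
prefixOfOnes :: Int -> Maybe [Int]
prefixOfOnes = loopK (\(n, xs) -> Just (take n xs, 1 : xs))
```

With these definitions, prefixOfOnes 3 evaluates to Just [1,1,1].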

Paterson generalizes the trace axioms of Section 6.2.2 for loop, and adds a law called
extension, similar to our purity property. As expected, he weakens the sliding axiom
so that the function moved over is of the form arr k for some function k, syntactically
guaranteeing purity. Unlike our sliding property for mfix, however, Paterson does not
require a further guarding equation to regulate the behavior on ⊥ (i.e., the antecedent in
Equation 2.5). Similarly, right tightening is postulated as an axiom as well. Therefore,
the failure of strong sliding or right shrinking properties for the underlying mfix will cause
the trace axioms to fail. Similar comments apply to arrows that are not derived from
monads as well. Paterson makes similar observations, although he does not weaken his
axiomatization accordingly [66].




6.4.2 Traced premonoidal categories 

Closely related to Paterson's work is Benton and Hyland's recent generalization of traces 
to premonoidal categories [5]. As we have seen throughout this chapter, the crux of
the problem lies in the monoidal requirement that comes with traces. Motivated by 
this observation, Benton and Hyland generalize traces to premonoidal categories. If the 
category is indeed monoidal, their definition simply reduces to the usual definition of traces 
over monoidal categories. 

Let us review the problem with the monoidal requirement more formally. What happens
when the monad is not commutative? Let $\mathcal{C}$ be symmetric monoidal, with $\otimes$ as
the monoidal operation. Let T be a strong monad over $\mathcal{C}$, with strength t. We do not
assume that T is commutative. Consider the Kleisli category of T, $\mathcal{C}_T$. For clarity, we
will use the symbol $\rightharpoonup$ to denote arrows in $\mathcal{C}_T$. The symmetry in $\mathcal{C}$ lifts into $\mathcal{C}_T$ with
no problems, i.e., $\mathcal{C}_T$ is symmetric as well. For any fixed object A, we have the functor
$A \otimes -$ in $\mathcal{C}_T$, mapping a given object B to $A \otimes B$ and an arrow $f : B \rightharpoonup B'$ to the
arrow $t \cdot (A \otimes f) : A \otimes B \to T (A \otimes B')$ in $\mathcal{C}$, which corresponds to the required arrow
$A \otimes B \rightharpoonup A \otimes B'$ in $\mathcal{C}_T$. It is easy to see that $- \otimes A$ also yields a functor in $\mathcal{C}_T$. However,
$\otimes$ is not a bifunctor, unless T is commutative. To see this, let $f : A \rightharpoonup A'$ and $g : B \rightharpoonup B'$
be two arrows in $\mathcal{C}_T$. There are two ways of obtaining the arrow $f \otimes g : A \otimes B \rightharpoonup A' \otimes B'$,
as captured by the following Haskell expressions:

    λ(a, b). f a >>= λa'. g b >>= λb'. return (a', b')
    λ(a, b). g b >>= λb'. f a >>= λa'. return (a', b')

The first composition is denoted by $\ltimes$, i.e., $f \ltimes g = A' \otimes g \cdot f \otimes B$. Similarly, we define
$f \rtimes g = f \otimes B' \cdot A \otimes g$. Unless the monad is commutative, these two compositions
are generally different, as they reflect the order in which f and g are performed. This
discrepancy is the main reason why the monoidal structure in the base category does not
lift to a monoidal structure in the Kleisli category.
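
The difference between the two compositions is easy to observe in a monad that records the order of effects. The sketch below is an illustration under our own assumptions: it uses the output monad on pairs with a String log (whose Monad instance is provided by Haskell's base library), and the names ltimes, rtimes, logF, and logG are hypothetical.

```haskell
-- Run f first, then g.
ltimes :: Monad m => (a -> m a') -> (b -> m b') -> (a, b) -> m (a', b')
ltimes f g (a, b) = f a >>= \a' -> g b >>= \b' -> return (a', b')

-- Run g first, then f.
rtimes :: Monad m => (a -> m a') -> (b -> m b') -> (a, b) -> m (a', b')
rtimes f g (a, b) = g b >>= \b' -> f a >>= \a' -> return (a', b')

-- Hypothetical effectful arrows: each one appends its name to the log.
logF :: Int -> (String, Int)
logF a = ("f", a + 1)

logG :: Int -> (String, Int)
logG b = ("g", b * 2)
```

Here ltimes logF logG (1, 2) evaluates to ("fg",(2,4)), while rtimes logF logG (1, 2) evaluates to ("gf",(2,4)): the values agree, but the effects do not, since string concatenation is not commutative. Replacing logF with return . (+ 1) makes the two results coincide, which is the behavior expected of an effect-free arrow.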

Of course, even when we only have a non-monoidal operation, there might exist a
subset of arrows for which the order does not matter: think of f (or g) having the form
return · h in the expressions above. Such arrows are called central. More formally, an arrow
f is central if, for all g, $f \ltimes g = f \rtimes g$, and $g \ltimes f = g \rtimes f$.

Generalizing from this example, Power and Robinson introduced premonoidal categories
[72]. In short, a premonoidal category is just like a monoidal category, except the
binary operation is only required to be functorial in each of the variables separately. (Note
that every monoidal category is trivially premonoidal.) As we have sketched above, Kleisli
categories of strong monads are examples of premonoidal categories.

Given a symmetric premonoidal category, can we come up with a notion of trace?
Recall that traces are only meaningful in symmetric monoidal categories. Naively, one
might hope that the definition of trace (Definition 6.2.3) might very well apply in this
case as well. Unfortunately this is not the case:

Proposition 6.4.2 (Benton, Hyland [5]) A symmetric premonoidal category with a trace
(Definition 6.2.3) is actually monoidal. □

As expected, the sliding axiom causes the trouble. Benton and Hyland show that
sliding implies $f \ltimes g = f \rtimes g$ for all arrows f and g, establishing that the category
is indeed monoidal. To remedy the situation, Benton and Hyland generalize traces to
centered symmetric premonoidal categories. A centered symmetric premonoidal category
is a premonoidal category $\mathcal{K}$, with a distinguished monoidal center $\mathcal{M}$, and an identity-on-objects
strict symmetric premonoidal functor $J : \mathcal{M} \to \mathcal{K}$ [72]. For our purposes, it
suffices to consider $\mathcal{M}$ as a subcategory of $\mathcal{K}$, where all arrows in $\mathcal{M}$ are central.

Kleisli categories of strong monads over symmetric monoidal categories are classical
examples of premonoidal categories. Let $\mathcal{M}$ be symmetric monoidal, and let T be a strong
monad over $\mathcal{M}$. As we have seen above, $\mathcal{M}_T$ is symmetric premonoidal. Recall that a
Kleisli category has the same objects as the base category. Let the functor $J : \mathcal{M} \to \mathcal{M}_T$


X(a, b). f a »= Xa'. g b >= Xb' . return (a 1 , b') 
A(a, b). g b »= Xb' . f a Xa' . return (a 1 , b') 



(Definition 16.2.31 ) is actually monoidal. 



□ 
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be defined as follows. On objects, J is the identity. Given an arrow / : A — ► B, let 
J f = rjb • f : A B, where r\ is the unit of T. Then J : A4 — > A^t is a centered 
symmetric premonoidal category, with the distinguished monoidal center A4. (Of course, 
J is nothing but the usual inclusion functor.) In this case, a central arrow in A4t is simply 
any arrow that is lifted from the monoidal center, i.e., any arrow that factors through ry 
in the base category. 
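
To make the two composition orders concrete, the following sketch instantiates the Kleisli
arrows with the writer-like monad ((,) String) from the Haskell base library, where the
order of effects is visible in the accumulated log. The names ltimes, rtimes, liftPure,
logInc, and logNot are ours and purely illustrative; they do not come from the cited works.

```haskell
-- Perform f first, then g (the first expression in the text).
ltimes :: Monad m => (a -> m a') -> (b -> m b') -> (a, b) -> m (a', b')
ltimes f g (a, b) = f a >>= \a' -> g b >>= \b' -> return (a', b')

-- Perform g first, then f (the second expression in the text).
rtimes :: Monad m => (a -> m a') -> (b -> m b') -> (a, b) -> m (a', b')
rtimes f g (a, b) = g b >>= \b' -> f a >>= \a' -> return (a', b')

-- Lift a pure function into a Kleisli arrow, i.e., J h = return . h.
liftPure :: Monad m => (a -> b) -> a -> m b
liftPure h = return . h

-- Two effectful arrows in the monad ((,) String): each appends to a log.
logInc :: Int -> (String, Int)
logInc n = ("inc;", n + 1)

logNot :: Bool -> (String, Bool)
logNot b = ("not;", not b)
```

With these definitions, ltimes logInc logNot and rtimes logInc logNot produce the same pair
of results but different logs, whereas replacing logInc by liftPure (* 2) makes the two
compositions agree, as expected of a central arrow.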

The intuitive understanding of a centered symmetric premonoidal category $J : \mathcal{M} \to \mathcal{K}$
is as follows: $\mathcal{K}$ is considered to be the category where arrows denote computations,
possibly with observable effects. As expected, $\mathcal{K}$ does not possess a monoidal structure.
$\mathcal{M}$, on the other hand, is a subcategory of $\mathcal{K}$ denoting values, i.e., where we can swap
the order of computations, duplicate values only to discard later, etc. A crude analogy in
programming terms is given by any "almost" functional language: For instance, think of
$\mathcal{K}$ as corresponding to the Standard-ML language, containing references etc., and $\mathcal{M}$ as
the purely functional subset of Standard-ML.

Getting back to traces, Benton and Hyland define [5]: 

Definition 6.4.3 (Traced centered symmetric premonoidal categories.) A trace on a
centered symmetric premonoidal category $J : \mathcal{M} \to \mathcal{K}$ is a family of functions:

    $\mathrm{Tr}^{U}_{A,B} : \mathcal{K}(A \otimes U, B \otimes U) \to \mathcal{K}(A, B)$

satisfying the same conditions as given in Definition 6.2.3, except (i) the sliding condition is
weakened such that $g$ is assumed central, and (ii) given a central arrow $f : A \otimes X \to B \otimes X$,
$\mathrm{Tr}^{X}_{A,B} f : A \to B$ is required to be central.

It is easy to see that this definition generalizes the notion of trace, since all arrows are 
central in a symmetric monoidal category. 

In order to generalize Theorem 6.2.4, Benton and Hyland also develop the notion
of Conway operators on Freyd categories. Briefly, a Freyd category is a symmetric
premonoidal category $J : C \to \mathcal{K}$, where $C$ is cartesian [73]. A parameterized fixed point
operator on a Freyd category $J : C \to \mathcal{K}$ is defined to be a family of functions

    $(\cdot)^{*}_{A,X} : \mathcal{K}(A \otimes X, X) \to \mathcal{K}(A, X)$    (6.24)

Benton and Hyland require $(\cdot)^{*}$ to satisfy the so-called center preservation, naturality, and
central fixed-point properties, corresponding to our left shrinking and purity laws.

To be able to establish a correspondence between traces over Freyd categories and
parameterized fixed point operators, Benton and Hyland define Conway operators, which
further satisfy laws that correspond to our right shrinking and nesting properties. Hence,
similar to Paterson's axiomatization of loop, the correspondence with premonoidal traces
only holds for the set of value recursion operators that further satisfy strong sliding and
right shrinking properties. As we have seen in Section 3.1, these two properties are
unsatisfiable for value recursion operators in general (Corollary 3.1.7).

6.5 Summary 

In this chapter, we summarized the notion of traces from category theory, and investigated 
how value recursion might fit into the picture. As we have seen, for a very small class of 
monads, value recursion operators correspond to trace operators over Kleisli categories. 
The environment monad is the most important example exhibiting this correspondence 
(other than the obvious identity monad). In the general case, however, the correspondence 
fails because of the monoidal requirement in the formalization of trace operators. 

It turns out that Paterson's loop axioms and Benton and Hyland's generalization of
traces to premonoidal categories are essentially the same, although developed
independently and presented in slightly different contexts [5, 66]. Both these axiomatizations take
the correspondence one step further, but not to the point where a satisfactory theory for
value recursion can emerge. To summarize, both require right shrinking and strong
sliding properties, which are known to be unsatisfiable for many monads (see Chapter 3). In
terms of concrete monads, their work can handle the lazy state and the output monads,
but not exceptions, lists, strict state, and the IO monad of Haskell. In this respect, we
consider both attempts to be significant steps in understanding and generalizing value
recursion, but not the final categorical account of the whole problem.



Chapter 7 



A recursive do-notation 



Haskell's do-notation simplifies monadic programming significantly, but it lacks support for 
recursive bindings, a key syntactic feature for value recursion. In this chapter, we describe 
an enhanced translation schema for the do-notation and its integration into Haskell.¹
The new translation will allow variables to be bound recursively, provided the underlying 
monad comes equipped with a value recursion operator. 

Synopsis. We start with a motivating example, showing the need for recursive bindings 
in the do-notation. The issues related to let-generators and the need for segmentation are 
discussed next, followed by a detailed description of the translation algorithm. We also 
provide several comments on the integration of the new do-notation into Haskell. 



7.1 Introduction 

Recursive declarations are ubiquitous in the functional paradigm. While fixed-point op- 
erators provide a solid framework for reasoning about and understanding recursion, they 
are hardly suitable for practical programming tasks. For instance, compare: 

    let sum n = if n == 0 then 0 else n + sum (n - 1) in sum 10

to its non-recursive equivalent:

    let sum = fix (\f -> \n -> if n == 0 then 0 else n + f (n - 1)) in sum 10

Clearly, the use of fix makes the definition much harder to read, especially for beginning 
programmers. The situation gets worse with mutually recursive bindings. 
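
As a runnable illustration of this comparison, the sketch below uses fix from the standard
Data.Function module. The names sumTo, sumToFix, and evenOdd are ours; evenOdd shows how
a pair of mutually recursive functions has to be threaded through a single call to fix,
with a lazy pattern on the tuple.

```haskell
import Data.Function (fix)

-- Ordinary recursive definition.
sumTo :: Integer -> Integer
sumTo n = if n == 0 then 0 else n + sumTo (n - 1)

-- The same function with the recursion made explicit through fix.
sumToFix :: Integer -> Integer
sumToFix = fix (\f -> \n -> if n == 0 then 0 else n + f (n - 1))

-- Mutual recursion through a single fix over a pair of functions.
-- The lazy pattern (~) keeps the tuple from being forced too early.
-- Intended for non-negative arguments only.
evenOdd :: (Integer -> Bool, Integer -> Bool)
evenOdd = fix (\ ~(ev, od) ->
                 ( \n -> n == 0 || od (n - 1)
                 , \n -> n /= 0 && ev (n - 1) ))
```

Both sumTo 10 and sumToFix 10 evaluate to 55; the mutually recursive case already requires
tupling and a lazy pattern, which is the kind of plumbing a binding construct should hide.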

¹ The material in this chapter is based on a paper that appears in the Haskell Workshop '02 [19].

As we have briefly mentioned in Section 1.3, a similar problem arises in the framework
of value recursion. Rather than using explicit calls to mfix, we would like to have a
complementary binding construct, providing syntactic support for value recursion. In the
context of Haskell, an extension to the do-notation allowing recursive bindings seems to
fit the bill. To illustrate, we will revisit the circuit modeling example from Section 1.2.
This time, we will model a simple counter, one that increments its output by 1 at each
clock tick. The count goes back to 0 whenever the reset line goes high:



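[Figure: counter circuit built from a multiplexer (MUX), a delay element initialized to 0 (DELAY 0), and an incrementer (+1).]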



By extending the Circuit class (see Section 1.2) with multiplexers and monadic lift
functions, we can model this circuit monadically as follows:

    counter :: Circuit m => Sig Bool -> m (Sig Int)
    counter reset = mfix (\ ~(next, inc, out, zero) ->
                             do next <- delay "zero" 0 inc
                                inc  <- lift1 "add1" (+1) out
                                out  <- mux reset zero next
                                zero <- lift0 "zero" 0
                                return (next, inc, out, zero))
                    >>= \(next, inc, out, zero) -> return out

As we have argued in Section 1.2, the monadic implementation has numerous advantages.
Syntactically, however, it carries a lot of baggage, making it hard to understand and
maintain. (Note that binders can be arbitrary patterns in general, as in "Just x <- f x",
making the situation even worse.) As pointed out by Launchbury et al. [49], and as we
have outlined in Section 1.3, what we need is a recursive counterpart of the do-notation,
allowing us to write simply [49]:

    counter reset = do next <- delay "zero" 0 inc
                       inc  <- lift1 "add1" (+1) out
                       out  <- mux reset zero next
                       zero <- lift0 "zero" 0
                       return out

eliminating the explicit call to mfix. Note that this description of the circuit follows the 
diagram given above almost literally. The translation we will introduce in this chapter 
will handle such recursive definitions automatically, without bothering programmers with 
the details of the necessary plumbing. 
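
For readers who wish to experiment, the following is a minimal simulation sketch of the
counter, written against GHC's RecursiveDo extension (which uses the keyword mdo). The
Circuit class of Section 1.2 is not reproduced here; the class below, the representation
of signals as lazy lists, the Identity-based simulation instance, and the helper simulate
are all simplifying assumptions of ours. The string arguments naming the gates are simply
ignored by this instance.

```haskell
{-# LANGUAGE RecursiveDo #-}

import Control.Monad.Fix (MonadFix)
import Data.Functor.Identity (Identity, runIdentity)

-- Assumption: a signal is the lazy list of its values at successive clock ticks.
type Sig a = [a]

-- Hypothetical stand-in for the Circuit class of Section 1.2.
class MonadFix m => Circuit m where
  delay :: String -> a -> Sig a -> m (Sig a)
  lift0 :: String -> a -> m (Sig a)
  lift1 :: String -> (a -> b) -> Sig a -> m (Sig b)
  mux   :: Sig Bool -> Sig a -> Sig a -> m (Sig a)

-- Pure simulation: no effects, gates are list transformers.
instance Circuit Identity where
  delay _ x s = return (x : s)
  lift0 _ x   = return (repeat x)
  lift1 _ f s = return (map f s)
  mux c a b   = return (zipWith3 (\sel x y -> if sel then x else y) c a b)

-- The counter, with recursive bindings resolved by mdo.
counter :: Circuit m => Sig Bool -> m (Sig Int)
counter reset = mdo
  next <- delay "zero" 0 inc
  inc  <- lift1 "add1" (+ 1) out
  out  <- mux reset zero next
  zero <- lift0 "zero" 0
  return out

-- Observe the first n outputs for a given reset signal.
simulate :: Int -> Sig Bool -> [Int]
simulate n reset = take n (runIdentity (counter reset))
```

For example, with the reset line high only at the fourth tick, the first six outputs of
the simulated counter are 0, 1, 2, 0, 1, 2.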
7.2 The basic translation and design guidelines

For clarity, we refer to the recursive version of the do-notation as the mdo-notation, and
write mdo-expressions using the keyword mdo.² Whenever we refer to the do-notation,
we mean the currently available notation in Haskell that does not allow variables to be
bound recursively.

Inspired by the counter circuit example of the previous section, one might naively
translate mdo-expressions as follows:

    mdo p1 <- e1                mfix (\ ~BV -> do p1 <- e1
        ...           ==>                         ...
        pn <- en                                  pn <- en
        e                                         return BV)
                                >>= \BV -> e

where BV stands for the tuple consisting of all variables occurring in patterns $p_1 \ldots p_n$.
The lazy match, obtained by ~, is essential in avoiding strictness problems.
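
As a concrete instance of this naive schema, consider two mutually dependent lists in the
Maybe monad. The sketch below writes out by hand what the schema would produce for the
expression shown in the comment; the name naive is ours, and mfix is the one exported by
Control.Monad.Fix in the base library.

```haskell
import Control.Monad.Fix (mfix)

-- Hand translation of:
--
--   mdo xs <- Just (1 : ys)
--       ys <- Just (2 : xs)
--       return (take 4 xs)
--
-- BV is the tuple (xs, ys). Without the lazy pattern (~) the tuple would be
-- forced before mfix has produced it, and the computation would not terminate
-- normally.
naive :: Maybe [Int]
naive =
  mfix (\ ~(xs, ys) -> do
          xs <- Just (1 : ys)
          ys <- Just (2 : xs)
          return (xs, ys))
    >>= \(xs, _) -> return (take 4 xs)
```

Evaluating naive yields Just [1,2,1,2].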

However, there are a number of problems raised by the schema above. First of all,
do-expressions in Haskell can use let-generators to introduce polymorphic bindings for
pure expressions [68]. It is not clear how such bindings can be integrated into this
translation. Similarly, ordinary do-expressions can bind identifiers repeatedly, later bindings
shadowing earlier ones. When bindings can be recursive, shadowing becomes
problematic. Furthermore, the use of a single mfix to handle recursion over the entire body of an
mdo-expression may induce poor termination properties whenever the right shrinking law
fails (see Section 7.2.2); intuitively, recursion should only be performed over generators
that depend on each other cyclically, leaving the rest untouched. Finally, we would like
to address these issues within the boundaries of the "syntactic-sugar" approach. That is,
the translation should produce only valid (well-formed and well-typed) Haskell code. This
approach keeps the extension simple, providing a smooth transition.

To summarize, the basic design guidelines for the mdo-notation are: 

• Syntactic agreement with the do-notation: Programmers familiar with the do-notation
should have no trouble using the recursive version.

• Semantic agreement with the do-notation: To the extent possible, valid do-expressions
should also be valid mdo-expressions, with their meanings preserved.

• Segmentation: Calls to mfix should be isolated to recursive segments only, leaving the
non-recursive parts out of the fixed-point computation. As we will see, segmentation
is essential because extending the scope of recursion can give poorer results for those
monads that fail to satisfy the right shrinking property.

• Pure syntactic sugar: The translation should only produce well-formed and well-typed
Haskell code.

² The closest we can get to μdo using ASCII.

In the remainder of this section, we address these issues, refining the basic translation 
scheme as we go along. 

7.2.1 Let generators 

The do-notation of Haskell allows let-generators, with the following translation [68]:

    do let p1 = e1                let p1 = e1
           ...           ==>          ...
           pn = en                    pn = en
       stmts                      in do stmts

The variables bound in $p_1 \ldots p_n$ can be polymorphically typed. In mdo-expressions, these
variables should be visible throughout the entire body as well, suggesting the translation:

    mdo stmts1                  mfix (\ ~BV -> do stmts1
        let p1 = e1                               let p1 = e1
            ...       ==>                             ...
            pn = en                                   pn = en
        stmts2                                    stmts2
        e                                         return BV)
                                >>= \BV -> e

where the variables bound in $p_1 \ldots p_n$ will appear in BV as well. Unfortunately, the
resulting code is not guaranteed to be well-typed. To illustrate, consider:

    mdo z <- f 2 y                      mfix (\ ~(z, y, f) ->
        y <- f 'a' z                      do z <- f 2 y
        let f x _ = return x     ==>         y <- f 'a' z
        return (f y z, f z y)                let f x _ = return x
                                             return (z, y, f))
                                        >>= \(z, y, f) -> return (f y z, f z y)

Since f is λ-bound, it becomes monomorphically typed, making its use at two different
types illegal. In fact, the situation is even worse: Referring to the schematic translation
above, let-bound variables in patterns $p_1 \ldots p_n$ will have monomorphic types over stmts1
and e, while they will retain their polymorphic typings over stmts2 and $e_1 \ldots e_n$. This
situation is quite bizarre. Unfortunately, there is no easy solution to this problem. Since
the tuple BV is λ-bound, the variables that appear in it will be monomorphically typed
when we attempt to type check the body of the do-expression and the final expression e.

How should we deal with this problem? Clearly, it is unacceptable to ban let-generators
completely because they are quite useful in practice. (Requiring let-bound variables to
be visible only in the textually following generators would also be wrong.) An alternative
is to go slightly beyond Haskell 98, using records with polymorphically typed fields [40].
Rather than using tuples, we can package the arguments into a record with polymorphic
fields, retaining the polymorphic typings of let-bound variables. However, the resulting
translation is overly complicated (as we need to perform type inference during the
translation), making it hard to formalize and automate [17]. One might also argue that we
can go beyond the "syntactic-sugar" approach, i.e., let the translation produce ill-typed
code, provided we can come up with special typing rules for mdo-expressions. We will not
pursue this option here, however, in order to be able to keep the translation as simple as
possible. (We will return to this point in Section 7.3.3.)

The solution we adopt is to require let bindings to be monomorphic in mdo-expressions.
That is, let becomes just a syntactic sugar within mdo, translated as:

    let p1 = e1                 BV <- return (let p1 = e1
        ...           ==>                         ...
        pn = en                                   pn = en
                                              in BV)

where BV is the tuple corresponding to the variables bound in $p_1 \ldots p_n$. This idea easily
extends to more complicated forms of function definitions as well. For instance:

    mdo let f []     = 0
            f (x:xs) = 1 + f xs
        return (f [1,2,3], f [])

    ==>

    mdo f <- return (let f []     = 0
                         f (x:xs) = 1 + f xs
                     in f)
        return (f [1,2,3], f [])

Note that we do not commit to a specific monomorphic type for f. As long as f is used
consistently at a single monomorphic type, the translation will be well-typed.
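
The effect of this translation can already be observed in an ordinary do-expression, since
the translated let-generator is just a monadic binding. In the sketch below (the name
lengths is ours), the function bound by <- is usable at one type only; the comment shows a
use that would be rejected.

```haskell
-- The translated form of a let-generator: f is bound by <-, hence monomorphic.
lengths :: Maybe (Int, Int)
lengths = do
  f <- return (let f []       = 0
                   f (_ : xs) = 1 + f xs
               in f)
  -- Both uses of f are at type [Int] -> Int, so this is well-typed.
  -- A use such as (f [1, 2, 3 :: Int], f "abc") would be rejected, because
  -- f would be needed at two different types.
  return (f [1, 2, 3 :: Int], f [])
```

Here lengths evaluates to Just (3, 0).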

We expect this restriction to be negligible in practice. Polymorphic let-generators
are hardly ever used, and experience suggests that there is almost always an
obvious way to rewrite the required polymorphic bindings using an explicit let-expression,
avoiding the whole problem. Therefore, we believe that the simplicity of this design far
outweighs any generality that might be obtained by more complicated translation schemas.

Remark 7.2.1 It might help programmers if monomorphic bindings were visually
distinguishable from polymorphic ones. In a recent paper, Hughes argues that the syntax
of let-expressions should be extended to allow monomorphic bindings, suggesting the use
of the symbol := to differentiate them from polymorphic ones [34]. If this idea ever gets
adopted in Haskell, let-generators in mdo-expressions can be restricted to use := as well,
emphasizing the fact that they will be monomorphically typed.

7.2.2 Segmentation 

Consider the following mdo-expression, which creates two infinite lists consisting of 1's
and 2's respectively, and its translation:

    mdo putStr "all 1s"
        ones <- return (1 : ones)
        putStr "all 2s"
        twos <- return (2 : twos)
        putStr "done"

    ==>

    mfix (\ ~(ones, twos) ->
            do putStr "all 1s"
               ones <- return (1 : ones)
               putStr "all 2s"
               twos <- return (2 : twos)
               return (ones, twos))
    >>= \(ones, twos) -> putStr "done"



The resulting code is quite unsatisfactory. The only recursion we need is in independently
computing the lists ones and twos, suggesting a segmented translation of the form:

    do putStr "all 1s"
       ones <- mdo ones <- return (1 : ones)
                   return ones
       putStr "all 2s"
       twos <- mdo twos <- return (2 : twos)
                   return twos
       putStr "done"

where the inner mdo-expressions will further be translated accordingly. This process is
analogous to the handling of ordinary let-expressions in Haskell, where mutually dependent
bindings are grouped together to enhance types of bound variables [68]. In our case, all
variables are λ-bound, i.e., monomorphic, so typing is not an issue. However, we still
need segmentation to avoid the unwanted interference from trailing computations. As an
example, let

    checkSingle :: [Int] -> IO ()
    checkSingle [x] = putStr "singleton"
    checkSingle _   = putStr "not-singleton"
and consider the following translation:³

    mdo xs <- return (1 : xs)          fixIO (\xs -> do xs <- return (1 : xs)
        checkSingle xs          ==>                     checkSingle xs
        return ()                                       return xs)
                                       >>= \xs -> return ()

Intuitively, we expect this mdo-expression to print "not-singleton", as the value of xs
should simply be the infinite list of 1's. Alas, the translation will diverge! The reason
is simply that the pattern matching in checkSingle is too strict for the computation to
proceed, failing the match immediately. However, with segmentation, we will get the code:

    do xs <- fixIO (\xs -> return (1 : xs))
       checkSingle xs
       return ()

which will happily print "not-singleton", avoiding the unintended interference.
Interestingly, if the final "return ()" is omitted, the original translation will work as well, since the
call to checkSingle will be the final expression, automatically pushed outside of the mfix
loop. Just adding "return ()" should not change the result, pointing out the need for
segmentation. Note that this problem will arise whenever right shrinking fails (Section 2.7.2),
which is the case for many practical monads of interest. (See Corollary 3.1.7.)
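
The two translations can be tried out directly with fixIO from System.IO. In the sketch
below, the names classify, segmented, and unsegmented are ours, and checkSingle is split
into a pure classifier and a printing wrapper only to make the behavior easy to inspect.
We expect unsegmented to fail at run time rather than print anything; how exactly the
failure shows up (non-termination or a run-time exception) depends on the implementation
of fixIO.

```haskell
import System.IO (fixIO)

-- Pure core of checkSingle: the pattern [_] forces the tail of its argument.
classify :: [Int] -> String
classify [_] = "singleton"
classify _   = "not-singleton"

checkSingle :: [Int] -> IO ()
checkSingle = putStrLn . classify

-- Segmented translation: only the recursive generator is inside fixIO.
segmented :: IO ()
segmented = do
  xs <- fixIO (\xs -> return (1 : xs))
  checkSingle xs
  return ()

-- Unsegmented translation: checkSingle runs inside the fixed-point loop and
-- demands the tail of xs before fixIO has produced it. Not expected to succeed.
unsegmented :: IO ()
unsegmented =
  fixIO (\xs -> do xs' <- return (1 : xs)
                   checkSingle xs'
                   return xs')
    >>= \_ -> return ()
```

Running segmented prints "not-singleton".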

7.2.3 Shadowing 

The current syntax of do-expressions allows variable names to be bound repeatedly, later
bindings shadowing earlier ones. One can accommodate such bindings in the mdo-notation
as well, by appropriately renaming them. As a design choice, however, we reject this
possibility. Although shadowing might be convenient at times, it is also a constant source of
bugs. Since bound variables are visible throughout the entire body in an mdo-expression,
allowing repetitions is much more likely to cause confusion.⁴ Therefore, we disallow
shadowing in mdo-expressions. (This design choice also implies that the scoping rules for
mdo-expressions are the same as those for let and where expressions, providing a
consistent view of scoping in Haskell's binding constructs, both pure and monadic.)

³ As we will see in Chapter 8, the library function fixIO :: (a -> IO a) -> IO a is the value recursion
operator for Haskell's IO monad [20].

⁴ In a similar vein, it can be argued that repetitions should not have been allowed in the do-notation
either. List comprehensions become especially horrible: f x = [x | x <- [x .. x+5], x <- [x .. x+10]] is
a confusing (yet legal) Haskell function.
7.3 Translation of mdo-expressions

We now present an algorithm to translate mdo-expressions to core Haskell.

7.3.1 Preliminaries

In the following discussion, we assume that let-generators are already de-sugared into their
return equivalents, as described in Section 7.2.1. We use the meta-variable p to range over
patterns, v over variables, and e over expressions.

Definition 7.3.1 (Defined variables.) A generator p <- e defines the variables that
appear in the pattern p. If the generator is of the form e, i.e., without any binding
patterns, then it defines no variables. An mdo-expression m defines a variable v, if v is
defined in a generator of m.

Definition 7.3.2 (Used variables.) A defined variable v is used in a generator p <- e if
v occurs free in e. (And similarly when there is no binding pattern p.)

Definition 7.3.3 (Recursive variables.) Let m be an mdo-expression, and v be a used 
variable of m. Let g be the generator that defines v. The variable v is recursive if it is 
either used by g itself, or by a generator of m that appears textually before g. 

Remark 7.3.4 Every defined variable comes from a distinct generator, due to the no- 
repetition requirement. Furthermore, only defined variables can be used, and only used 
variables can be recursive. That is, for an arbitrary mdo-expression, we have: 

    Recursive Variables $\subseteq$ Used Variables $\subseteq$ Defined Variables

Definition 7.3.5 (Dependent generators.) A generator g is dependent on a textually 
following generator g', if 

• g' defines a variable that is used by g, 

• or, g' textually appears in between g and g'', where g is dependent on g''.

Remark 7.3.6 The second condition in the above definition can be considered as interval 
closure. Note that, unlike a usual let-expression, we cannot reorder the generators: Order 
does matter in performing side effects. Hence, if a generator is dependent on another, we 
are forced to package them together with all the generators in between. 
Definition 7.3.7 (Segments.) A segment of an mdo-expression is a minimal sequence
of generators such that no generator of the sequence depends on an outside generator. As 
a special case, although it is not a generator, the final expression in an mdo-expression is 
considered to form a segment by itself. 

Remark 7.3.8 To compute the segments, it suffices to start with the first generator of an 
mdo-expression, and search for the last generator that it depends on. If such a generator 
exists, we add all the generators up to and including it to the segment. This process is 
repeated for each and every one of the generators in the segment, until we cannot add any 
new generators. Once a segment is found, the very next generator starts a new segment. 
Note that the number of segments is bounded above by the number of generators in the 
mdo-expression, plus one for the segment corresponding to the final expression. 

Definition 7.3.9 (Free variables of a segment.) Let m be an mdo-expression, v be a 
defined variable, and s be a segment of m. We say that v is free in s if (i) v appears 
free in the right hand side of a generator of s, and (ii) v is defined in a segment textually 
preceding s. 

Definition 7.3.10 (Exported variables of a segment.) A variable that is defined in a 
segment is exported if it is free in any of the textually following segments. 

7.3.2 The translation algorithm 

We describe the algorithm step by step using the following schematic running example:

    mdo {a b} <- {c d}      s0
        {e}   <- {f}        s1
        {g}   <- {h}        s2
        {f}   <- {a}        s3
        {i j} <- {i e}      s4
        {j g k}             s5

where $\{v_1 \ldots v_n\}$ stands for a pattern that binds the variables $v_1 \ldots v_n$ on the left hand side
of a generator, and for an expression whose free variables are $v_1 \ldots v_n$ on the right hand
side. Note that the actual patterns or expressions are not important for our purposes. For
instance, the generator $s_3$ uses the variable a, and defines f. Generator $s_2$ defines g, but
does not use h, since h is not defined in this expression. For our purposes, it is nothing
but a constant. Similar remarks apply to the variables c, d, and k as well.
Segmentation step: Starting with the first generator, form the segments as described
in Remark 7.3.8.

To perform this step, we will need the defined ($D_i$) and used variables ($U_i$) of each
generator $s_i$. Luckily, for our running example, these sets are obvious:

    $D_0 = \{a, b\} \quad D_1 = \{e\} \quad D_2 = \{g\} \quad D_3 = \{f\} \quad D_4 = \{i, j\} \quad D_5 = \emptyset$
    $U_0 = \emptyset \quad U_1 = \{f\} \quad U_2 = \emptyset \quad U_3 = \{a\} \quad U_4 = \{i, e\} \quad U_5 = \{j, g\}$

To compute the segments, we start with $s_0$. Since $s_0$ does not use any variables, it
cannot depend on other generators, i.e., it forms a segment by itself. The next generator
to consider is $s_1$, which uses the variable f. Since f is defined by $s_3$, we have to package
everything in between, i.e., $s_1$, $s_2$, and $s_3$ together. Since none of them depends on $s_4$ or
$s_5$, we stop the iteration, forming our second segment. It is easy to see that $s_4$ and $s_5$
form the next two segments by themselves. Hence, we obtain:

    $S_0 = \{s_0\}, \quad S_1 = \{s_1, s_2, s_3\}, \quad S_2 = \{s_4\}, \quad S_3 = \{s_5\}$
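
The procedure of Remark 7.3.8 can be sketched as a small Haskell function. In this sketch,
which is ours and not the code of any actual implementation, a generator is abstracted to
the list of variables it defines and the list of variables occurring free in its right hand
side; generators are identified by their positions, and the final expression is represented
as a generator that defines nothing. The names Gen, segments, and runningExample are
hypothetical.

```haskell
type Var = String

-- A generator, abstracted to its defined variables and the free variables of
-- its right-hand side.
data Gen = Gen { defs :: [Var], uses :: [Var] }

-- Split the generators (given in textual order) into segments of indices.
segments :: [Gen] -> [[Int]]
segments gens = go 0
  where
    n = length gens

    -- Index of the last later generator that defines a variable used by
    -- generator i, or i itself if there is no such generator.
    lastDep :: Int -> Int
    lastDep i =
      maximum (i : [ j | j <- [i + 1 .. n - 1]
                       , any (`elem` defs (gens !! j)) (uses (gens !! i)) ])

    -- Interval closure: keep extending the segment until no generator in it
    -- depends on a generator outside of it.
    grow :: Int -> Int -> Int
    grow start end
      | end' == end = end
      | otherwise   = grow start end'
      where end' = maximum (map lastDep [start .. end])

    go :: Int -> [[Int]]
    go start
      | start >= n = []
      | otherwise  = [start .. end] : go (end + 1)
      where end = grow start start

-- The running example; the last entry stands for the final expression.
runningExample :: [Gen]
runningExample =
  [ Gen ["a", "b"] ["c", "d"]
  , Gen ["e"]      ["f"]
  , Gen ["g"]      ["h"]
  , Gen ["f"]      ["a"]
  , Gen ["i", "j"] ["i", "e"]
  , Gen []         ["j", "g", "k"]
  ]
```

Applied to runningExample, segments returns [[0],[1,2,3],[4],[5]], in agreement with the
segments computed above.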

Analysis step: For each segment $S_i$ do the following: For each variable v defined in the
segment, determine whether it is recursive (Definition 7.3.3). Collect all recursive variables
of the segment $S_i$ in the set $R_i$. If $R_i$ is empty, this segment does not need fixed-point
computation; leave it untouched. If $R_i$ is not empty, compute the exported variables of
the segment, $E_i$, and mark this segment as recursive for future processing. Returning to
our example, we have:

    $R_0 = \emptyset \quad R_1 = \{f\},\ E_1 = \{e, g\} \quad R_2 = \{i\},\ E_2 = \{j\} \quad R_3 = \emptyset$

Since only $R_1$ and $R_2$ are non-empty, we mark $S_1$ and $S_2$ as recursive; other segments
are left untouched. (Note that the last segment can never be recursive.)
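
The analysis step can be sketched in the same style. The code below is again only an
illustration under our own representation: a generator is a pair of defined and used
variable lists, a segment is a list of generator positions, and the names Gen,
recursiveVars, exportedVars, and runningExample are hypothetical. The two functions follow
Definitions 7.3.3 and 7.3.10, respectively.

```haskell
type Var = String

data Gen = Gen { defs :: [Var], uses :: [Var] }

-- Variables defined in the segment that are used by their defining generator
-- or by a textually earlier generator (Definition 7.3.3).
recursiveVars :: [Gen] -> [Int] -> [Var]
recursiveVars gens seg =
  [ v | j <- seg
      , v <- defs (gens !! j)
      , any (\g -> v `elem` uses g) (take (j + 1) gens) ]

-- Variables defined in the segment that are used in a textually following
-- segment (Definition 7.3.10).
exportedVars :: [Gen] -> [Int] -> [Var]
exportedVars gens seg =
  [ v | j <- seg
      , v <- defs (gens !! j)
      , any (\g -> v `elem` uses g) later ]
  where
    later = drop (maximum seg + 1) gens

-- The running example; the last entry stands for the final expression.
runningExample :: [Gen]
runningExample =
  [ Gen ["a", "b"] ["c", "d"]
  , Gen ["e"]      ["f"]
  , Gen ["g"]      ["h"]
  , Gen ["f"]      ["a"]
  , Gen ["i", "j"] ["i", "e"]
  , Gen []         ["j", "g", "k"]
  ]
```

For the segment [1,2,3] of the running example these functions return ["f"] and ["e","g"],
and for the segment [4] they return ["i"] and ["j"], matching the sets given above.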

Translation step: At this point, we are left with a number of segments, some of which
are marked recursive by the previous step. For each marked segment, create the tuples
ET and RT corresponding to the sets E and R. (If E is empty, ET will be the empty
tuple.) Create and add a brand new variable v to the tuple RT. Then, form the generator:

    ET <- mfix (\ ~RT -> do ...
                            v <- return ET
                            return RT)
          >>= \RT -> return v

where the dotted lines are filled with the generators of the segment.
Note that segments that are marked recursive by the previous step are turned into
a single generator, while non-recursive segments are left untouched.⁵ Returning to our
example, we create the following generator for $S_1$:

    (e, g) <- mfix (\ ~(f, v) -> do {e} <- {f}
                                    {g} <- {h}
                                    {f} <- {a}
                                    v <- return (e, g)
                                    return (f, v))
              >>= \(f, v) -> return v

and the following for $S_2$:

    j <- mfix (\ ~(i, v) -> do {i j} <- {i e}
                               v <- return j
                               return (i, v))
         >>= \(i, v) -> return v

Finalization step: Now, concatenate all segments and form a single do-expression out
of them. For our example, we obtain:

    do {a b} <- {c d}
       (e, g) <- mfix (\ ~(f, v) -> do {e} <- {f}
                                       {g} <- {h}
                                       {f} <- {a}
                                       v <- return (e, g)
                                       return (f, v))
                 >>= \(f, v) -> return v
       j <- mfix (\ ~(i, v) -> do {i j} <- {i e}
                                  v <- return j
                                  return (i, v))
            >>= \(i, v) -> return v
       {j g k}
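
To see the translation and finalization steps on a concrete program, the sketch below
applies them by hand to a small mdo-expression in the Maybe monad and compares the result
with the same expression written using GHC's RecursiveDo extension. The names translated
and viaMdo are ours. In this example the only recursive variable is ys and the only
exported variable is xs, so RT is (ys, v) and ET is xs.

```haskell
{-# LANGUAGE RecursiveDo #-}

import Control.Monad.Fix (mfix)

-- Source expression, using the mdo keyword of GHC's RecursiveDo extension.
viaMdo :: Maybe [Int]
viaMdo = mdo
  xs <- Just (1 : ys)
  ys <- Just (2 : xs)
  return (take 3 xs)

-- Hand translation following the schema: one recursive segment containing both
-- generators, with R = {ys}, E = {xs}, and a fresh variable v carrying ET.
translated :: Maybe [Int]
translated = do
  xs <- mfix (\ ~(ys, v) -> do xs <- Just (1 : ys)
                               ys <- Just (2 : xs)
                               v  <- return xs
                               return (ys, v))
          >>= \(_, v) -> return v
  return (take 3 xs)
```

Both expressions evaluate to Just [1,2,1].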

Remark 7.3.11 If there are no recursive bindings present to start with, the algorithm
we have described will just leave the input untouched (except for replacing the keyword
mdo by do). That is, the left shrinking property is automatically applied by the algorithm
to get rid of unnecessary calls to mfix. (See Section 2.3.)



⁵ Depending on the sets E and R, several other improvements are possible in forming the required
generator. For instance, if E is a subset of R, then we do not need a new variable. We skip a detailed
discussion of these improvements here, as they are not essential for the translation.

Desugaring step: Now we are left with a non-recursive do-expression, and we can apply
the standard translation to replace the do with explicit >>='s, completing the
translation [68].

7.3.3 Type checking mdo-expressions 

To accommodate the overloading of the name mfix, we simply add the following type
class to Haskell:

    class Monad m => MonadFix m where
        mfix :: (a -> m a) -> m a
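
As an illustration of what an instance of this class looks like, the sketch below defines
a small environment monad and its value recursion operator, using the MonadFix class as it
is exported by Control.Monad.Fix in the base library. The type Env and the names ask and
cyclic are ours; the definition of mfix simply takes the fixed point of the result for
each fixed environment.

```haskell
import Control.Monad.Fix (MonadFix (..))
import Data.Function (fix)

-- A minimal environment (reader) monad.
newtype Env r a = Env { runEnv :: r -> a }

instance Functor (Env r) where
  fmap f (Env g) = Env (f . g)

instance Applicative (Env r) where
  pure x = Env (const x)
  Env f <*> Env g = Env (\r -> f r (g r))

instance Monad (Env r) where
  Env g >>= k = Env (\r -> runEnv (k (g r)) r)

-- Value recursion: for each environment, tie the knot on the result.
instance MonadFix (Env r) where
  mfix f = Env (\r -> fix (\a -> runEnv (f a) r))

-- Read the environment.
ask :: Env r r
ask = Env id

-- A cyclic definition whose shape depends on the environment:
-- xs = n : map (+ n) xs, where n is read from the environment.
cyclic :: Env Int [Int]
cyclic = mfix (\xs -> do n <- ask
                         return (n : map (+ n) xs))
```

For instance, take 4 (runEnv cyclic 3) evaluates to [3,6,9,12].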

Intuitively, an mdo-expression is well-typed if its translation produces a well-typed
Haskell expression. In order to perform type-inference, a type judgement of the form:

    $$\frac{\Gamma' \vdash e_i : m\,\tau_i \qquad \Gamma' \vdash p_i : \tau_i \qquad \Gamma' \vdash e : m\,\tau}{\Gamma \vdash \mathbf{mdo}\ \{p_i \leftarrow e_i\}\ e : m\,\tau}$$

suffices, with the side condition that m must belong to the MonadFix class. In this rule, $\Gamma'$
is obtained by extending $\Gamma$ with the variables defined in the given mdo-expression. Each
such variable is assigned a monomorphic type variable to begin with. (For simplicity, we
assume all generators have the form p <- e.) The only special care is needed in handling
let-generators, which can be typed similarly to normal let-expressions. To ensure that
let-bound variables are monomorphic, it suffices to leave out the generalization step in the
type inference algorithm for let-bound variables | |17l W\\ . 

As we have promised in Section 7.2.1, let us reconsider the typing of let-generators,
aiming to find a solution that would allow polymorphic bindings. In fact, it is arguable
that we should have a more liberal scheme, where normal bindings can be polymorphic as
well. For instance, there is no reason why the following expression should be ill-typed:

    poly :: Maybe ([Bool], [Int])    -- ill-typed
    poly = do nil <- return []
              return (True : nil, 1 : nil)

However, poly is not a well-typed Haskell expression, since the binding to nil is
required to be monomorphic. Of course, we cannot allow polymorphic typings arbitrarily,
as illustrated by the infamous ML-typing problem [93], coded here in Haskell:

    do rf <- newSTRef (\x -> x)
       writeSTRef rf (\x -> x + 1)
       f <- readSTRef rf
       return (f True)
Following the previous example, we might think that rf might be assigned the type
$\forall a.$ STRef s (a -> a), which leads to disaster. So, it seems that the maybe monad is
mild enough that generalization is acceptable, but the state monad is not. It is beyond
the scope of our current work to investigate exactly when one might allow generalization,
but we conjecture that it is safe to do so in the following two cases:

• For any variable, provided the underlying monad is completely definable in Haskell,
and not built on top of one of the internal state or IO monads,

• Or, variables bound by the let-generators, regardless of the underlying monad.

Since checking for the first condition seems to be rather expensive, we might settle 
for allowing generalization in let-bound variables only, which coincides with the treatment 
of let-generators in the current do-notation. (Such a solution would be similar to ML's 
value restriction, where only "syntactically distinguishable" values are typed polymorphi- 
cally [93].) Of course, a more detailed study is needed before such an approach can be 
adopted. We leave the exploration of this idea for future work. 
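
For comparison, the current do-notation already accepts the polymorphic use of nil when it
is introduced by a let-generator rather than by a monadic binding. The following sketch
(the name polyLet is ours) is the let-based counterpart of the ill-typed poly example:

```haskell
-- nil is let-bound, so it is generalized to the type [a] and can be used
-- both as a list of Bool and as a list of Int.
polyLet :: Maybe ([Bool], [Int])
polyLet = do
  let nil = []
  return (True : nil, 1 : nil)
```

Here polyLet evaluates to Just ([True],[1]).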

7.4 Current status and related work 

The mdo-notation is implemented both by the Hugs interpreter [37] and the GHC
compiler [26]. Details on these implementations can be found on the web [75].

Predating our work, the need for recursive bindings in the do-notation was also dis- 
cussed in the framework of Nordlander's O'Haskell language, a concurrent, object-oriented 
extension to Haskell [65]. O'Haskell extends the do-notation with a variety of new features. 
With regard to recursion, O'Haskell provides a special keyword fix, providing a way to 
specify a block of generators with mutual dependencies. The translation for fix-blocks is a 
simpler version of ours: No segmentation is performed and let-generators are not allowed. 
The translation seems to permit shadowing, but that appears to be an oversight, rather 
than a conscious design decision. The addition of the fix keyword to the do-notation in 
O'Haskell arose from practical programming needs; the syntax and the translation were
not designed to meet a general need. 

Paterson's arrow-notation supports recursive bindings as well, provided the underlying 
arrow comes equipped with a loop operator [66]. (See Section 6.4.1 for a discussion of
arrows and loop operators.) Similar to O'Haskell, mutually dependent generators are 
explicitly marked, using the keyword rec. No segmentation is performed on recursive 
blocks. Currently, let-generators are not supported in the arrow-notation, but the addition 
of such bindings seems straightforward. We note that all variables become λ-bound after
the translation in the arrow-notation, forcing monomorphic types. Hence, regardless of the
support for recursive bindings, let-generators will suffer from the monomorphism problem 
in the arrow-notation. 

7.5 Summary 

In this chapter, we have described an alternative translation schema for the do-notation of 
Haskell, providing syntactic support for recursive bindings. The ability to bind variables 
recursively in the do-notation is an essential feature for value recursion as it elegantly 
hides the use of explicit value recursion operators. 

Recalling the design goals we have set for the mdo-notation, we can conclude that 
our translation fulfills its purpose. To review briefly, we have aimed for syntactic and 
semantic agreement with the do-notation, segmentation for grouping minimally dependent 
sequences of statements together, and preservation of the syntactic-sugar status. Our 
translation achieves all these goals, except for syntactic agreement for a relatively small set 
of do-expressions. Since let-generators become monomorphic and shadowing is no longer 
allowed, any do-expression using these features will be rejected. However, we believe that 
neither of these restrictions will cause serious problems in practice. Also, if desired, the 
typing problem might be remedied by devising a solution along the lines we have described 
in Section 7.3.3.

It is our belief that Haskell should have just one version of the do-notation. Just like 
let-expressions, do-expressions should be capable of expressing both recursive and non- 
recursive bindings. (The type system will insist on the MonadFix instance only when 
recursive bindings are used.) However, such a change will potentially break existing pro- 
grams, due to the minor incompatibilities mentioned above. Therefore, a separate notation 
(using the keyword mdo) has been adopted for the time being, possibly switching to the 
new translation in a future version of the Haskell standard. 
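
As an illustration of the notation summarized above, here is a small sketch that uses the mdo keyword as provided by GHC's RecursiveDo extension. The function name onesPrefix is hypothetical; the example assumes only the standard MonadFix instance for Maybe.

```haskell
{-# LANGUAGE RecursiveDo #-}

-- The variable xs is used inside the generator that binds it. The
-- translation inserts a call to mfix, so the Maybe instance of MonadFix
-- ties the knot lazily.
onesPrefix :: Int -> Maybe [Int]
onesPrefix n = mdo
  xs <- Just (1 : xs)
  return (take n xs)
```

With this definition, onesPrefix 3 evaluates to Just [1,1,1]. Writing the same body with the keyword do instead would be rejected, since xs would not be in scope on the right-hand side of its own generator.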



Chapter 8 
The IO monad and fixIO 



The IO monad of Haskell comes equipped with a value recursion operator, namely the
function fixIO.¹ Both the IO monad and fixIO are language primitives in Haskell, i.e.,
they cannot be defined within the language itself. Therefore, any attempt to formally
reason about fixIO is futile without a viable semantics for computations in the IO monad.
Recently, Peyton Jones introduced an operational semantics based on observable transi-
tions as a method for reasoning about I/O in Haskell [67]. In this chapter, we build on
his framework, and show how one can model fixIO as well.²

Synopsis. We start with a brief discussion of the operation of fixIO, showing how it 
fits within the rest of the IO monad. We then describe a core language based on Haskell,
with basic monadic I/O facilities. We continue by giving a layered semantics for this 
language. Finally, we show that our model of fixIO satisfies the requirements for being a 
value recursion operator with respect to our semantics. 



8.1 Introduction 

Ever since Peyton Jones and Wadler showed how monads can be used to model I/O in
a language with non-strict semantics, monadic I/O has been the standard way of dealing
with input and output in Haskell [69]. The IO monad in Haskell comes equipped with a
value recursion operator, namely the function fixIO. As Achten and Peyton Jones point
out, and as with all value recursion operators, fixIO "... allows us to manipulate results
[of IO computations] that are not yet computed, but lazily available" [1, Section 4.1].

Unlike many other monads, the IO monad of Haskell is built into the language, as it
cannot be defined within Haskell itself. As a consequence, fixIO is a language primitive 

1 The function fixIO is not part of the standard Haskell library [68]. Implementations, including Hugs
and GHC, provide it generally in the IOExts library. 

2 This chapter is based on a paper that will appear in the Journal of Theoretical Informatics and 
Applications [21]. A preliminary version of the material presented in this chapter appeared in the Fixed
Points in Computer Science Workshop'2001 [20].

as well. Given we do not have direct access to the internals of the IO monad, how can
we understand the operation of fixIO? Or, in general, how can we understand IO-based
computations? Recently, Peyton Jones introduced a semantics for Haskell IO [67], similar
to the monadic transition systems of Gordon [27]. In such a system, IO computations
are viewed as sequences of labeled transitions. Each label indicates an effect observable
in the real world, similar to those found in process calculi [61]. Peyton Jones's work
used an embedding of a denotational semantics for the functional layer into the IO layer.
However, it bypassed the details of this embedding. Such an approach is fine, as long as
one is interested in the big picture. If, on the other hand, one wants to reason about fixIO,
it becomes necessary to be explicit about the relationship between the IO and functional
layers. One aim of this chapter is to bridge this gap. 

Our semantics is structured in two layers: IO and functional. The semantics for the
IO layer is based on the approach taken by Peyton Jones [67]. The semantics for the
functional layer is based on the natural semantics for lazy evaluation of Launchbury [48]. 
A final set of rules precisely shows how these two layers interact with each other. It is 
this interaction that allows us to give a semantics for fixIO. (The material in this chapter 
builds directly on Peyton Jones's and Launchbury's work mentioned above. We assume 
that the reader is already familiar with these papers.) 

8.2 Motivating examples 

Although fixIO is just like any other value recursion operator we have seen so far, the fact 
that we cannot give a Haskell definition for it makes it rather mysterious. Also, the IO
monad provides mutable variables, a feature that we will have to deal with explicitly. We 
start by considering several examples to get familiar with the operation of fixIO. 

Example 8.2.1 Our first example shows the interaction of fixIO with input operations: 

fixIO (λcs. do c <- getChar
               return (c : cs))

When we run this computation, a character will be read from the standard input, say
a. Then, the computation will immediately deliver an infinite list of a's.³ We will be
able to pull out as many characters as we wish out of this list, following the demand- 
driven evaluation policy of Haskell. There are two crucial points: (i) the action getChar 

3 Note that, by applying the left shrinking and purity properties, we can reduce this expression to
getChar >>= λc. return (fix (λcs. c : cs)), guaranteeing the described behavior axiomatically. Of course,
we have not yet established that these two properties hold for fixIO, but we will do so in Section 8.6.

is executed only once, and (ii) the computation terminates immediately after the reading
is done, i.e., the infinite list is not constructed prior to its demand. In other words, the
fact that the IO monad is strict in actions but not in values is preserved by fixIO.

Here, we also get a feel for what fixIO provides: It provides a means for recursively 
defining values resulting from IO computations. That is, it allows naming results of
computations that will only be available later on. For instance, in the expression above, 
we were able to name the result of the computation as cs, before we had its value computed. 
In this sense, the semantics is similar to the semantics of the pure expression: 

let cs = 'a' : cs in cs 

which is a convenient way of writing fix (λcs. 'a' : cs), where fix is the usual fixed-point
operator. Except, of course, in the fixIO case the character in the list is determined by the 
call to getChar, i.e., it depends on the actual input available when we run the computation. 
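
The two crucial points of Example 8.2.1 can be checked against an actual implementation. The sketch below uses fixIO as exported by System.IO in GHC's base library. The names repeatFirst and demo are ours, and the input action is passed as a parameter so that a counting stand-in can replace getChar; this is an assumption made for testability, not part of the original example.

```haskell
import Data.IORef (modifyIORef, newIORef, readIORef)
import System.IO (fixIO)

-- The expression of Example 8.2.1, abstracted over the input action.
-- Passing getChar for readChar gives exactly the original expression.
repeatFirst :: IO Char -> IO String
repeatFirst readChar = fixIO (\cs -> do
  c <- readChar
  return (c : cs))

-- Runs repeatFirst with an action that counts how often it is executed
-- and always delivers 'a'. Returns a finite prefix of the (infinite)
-- result together with the number of executions.
demo :: IO (String, Int)
demo = do
  counter <- newIORef (0 :: Int)
  cs <- repeatFirst (modifyIORef counter (+ 1) >> return 'a')
  n <- readIORef counter
  return (take 5 cs, n)
```

Running demo returns ("aaaaa", 1): the action is executed once, and the infinite list is only unfolded as far as it is demanded.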

Example 8.2.2 Let us revisit the fudgets example given by Expression 4.35. In terms
of fixIO, the corresponding computation is given by: 

fixIO (λc. do putChar c
              return 'a')

When run, this computation diverges as c is not yet available when requested by putChar. 
(Note that this behavior is in accordance with mfix as discussed in Section 4.8.)

Example 8.2.3 Here is a Haskell expression showing the interaction of fixIO with mu- 
table variables: 

fixIO (λ~(x, _). do y <- newIORef x
                    return (1 : x, y))
  >>= λ(_, l). readIORef l

In this expression, we allocate a cell in which we store the value of the variable x, before 
we know what that value really is. The value of x, determined through the fixed point 
computation, is the infinite list of 1's. The call to fixIO returns the value (which is
discarded) and the address of the cell that stores this cyclic structure. Outside of the call
to fixIO, we dereference the address and get back the lazily computed list of 1's. Although
this example might look superficial, it basically captures the essence of cyclic structures
with mutable nodes. (See Section 9.4 for an example, where we use a similar idea to
implement doubly linked circular lists in Haskell.) 
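
For readers who wish to run Example 8.2.3, the following is a direct transcription into concrete Haskell syntax, assuming fixIO from System.IO and the IORef operations from Data.IORef in GHC's base library. The name cyclicCell is ours.

```haskell
import Data.IORef (IORef, newIORef, readIORef)
import System.IO (fixIO)

-- The lazy pattern ~(x, _) is essential: a strict pattern would demand
-- the result of the computation before it is available.
cyclicCell :: IO [Int]
cyclicCell =
  fixIO (\ ~(x, _) -> do
            y <- newIORef x          -- store x before its value is known
            return (1 : x, y))
    >>= \(_, l) -> readIORef l       -- dereference outside of fixIO
```

Taking the first three elements of the list delivered by cyclicCell gives [1,1,1].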

Once we describe our semantics, we will revisit these examples to see how our system 
works in practice.

8.3 The language

In this section, we define a language based on Haskell [68], supporting monadic IO primitives,
mutable variables, usual recursive definitions, and value recursion.

Notation 8.3.1 We use the following naming conventions for variables: 

c ∈ constructors
x, y, z, w ∈ heap variables
r, s, t ∈ mutable variables

To simplify the discussion, we syntactically distinguish between heap and mutable vari- 
ables: They are drawn from different alphabets. 

Definition 8.3.2 (Terms and values.) Terms and values are defined mutually recur- 
sively by the following grammars, respectively: 



M, N ::= x
       | V
       | M N
       | let x = M in N
       | case M of {c_i x_i → N_i}

V ::= c x_1 x_2 ... x_i
    | λx. M
    | return M | M >>= N
    | getChar | putChar M
    | fixIO M | update_z M
    | r
    | newIORef M
    | readIORef M
    | writeIORef M N



The function update_z, associated with the heap variable z, cannot appear in a valid
input program, and it is never the result of any program either. It is only used internally,
in giving a semantics to fixIO. We will explain its role in detail later. All other constructs
have the same meaning and type as they do in Haskell [7]. Note that IO actions are values
as far as the purely functional world is concerned. 

For the purposes of this chapter, we only work with well-typed terms, and ignore the 
issues of type checking and inference. We assume that the usual Haskell rules apply to 
determine well typed terms. (Typing of Haskell programs has been discussed in detail in 
the literature [Hi [68].) Notice that return, >>=, fixIO, etc., are polymorphic constants.
As usual, let expressions provide recursive (and possibly polymorphic) bindings. 

A constructor c of arity i is treated as a function λx_1 ... x_i. c x_1 ... x_i, which becomes
a value of its own when fully applied. This case is captured by the first alternative in the
definition of values, where c is assumed to have arity i. We model constants as nullary
constructors, that is, numbers, characters, etc., are treated as constructors with zero arity.
(As a notational hint, we will use the letter k to refer to constants.)

Remark 8.3.3 It is worth noting that the grammar we gave describes the syntax for
the reduced terms of our language rather than the concrete syntax that we will allow
ourselves to use. In particular, we will freely use the do-notation and pattern bindings in
λ-abstractions. In each case, however, the translation to the core language will be trivial.

Definition 8.3.4 (IO and pure terms.) A well-typed term of type IO τ, for some type
τ, is called an IO term. All other terms are called pure.

Definition 8.3.5 (Terminal values.) A value is called terminal if it has one of the 
following forms: 

• c x_1 x_2 ... x_i, where c is a constructor of arity i,

• λx. M,

• return M, 

where M is an arbitrary term in the second and third cases. 
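
To make Definitions 8.3.2 and 8.3.5 concrete, the grammar can be transcribed into a pair of mutually recursive Haskell data types. This is only a sketch: all type and constructor names below are ours, variables are represented by strings, and case alternatives are simplified to a list of triples.

```haskell
type HeapVar = String
type MutVar  = String
type Con     = String

-- Terms M, N of Definition 8.3.2.
data Term
  = Var HeapVar
  | Val Value
  | App Term Term
  | Let HeapVar Term Term
  | Case Term [(Con, [HeapVar], Term)]   -- alternatives c_i x_i -> N_i
  deriving (Eq, Show)

-- Values V of Definition 8.3.2.
data Value
  = ConApp Con [HeapVar]     -- fully applied constructor c x_1 ... x_i
  | Lam HeapVar Term
  | Return Term
  | Bind Term Term           -- M >>= N
  | GetChar
  | PutChar Term
  | FixIO Term
  | Update HeapVar Term      -- update_z M, used internally only
  | Ref MutVar               -- a mutable variable r
  | NewIORef Term
  | ReadIORef Term
  | WriteIORef Term Term
  deriving (Eq, Show)

-- Terminal values of Definition 8.3.5.
isTerminal :: Value -> Bool
isTerminal (ConApp _ _) = True
isTerminal (Lam _ _)    = True
isTerminal (Return _)   = True
isTerminal _            = False
```

For instance, isTerminal (Return (Var "x")) holds, whereas isTerminal GetChar does not: getChar is a value of the functional layer, but it still has work to do in the IO layer.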

Definition 8.3.6 (Heaps.) A heap is a finite partial function from heap variables to
terms extended with a special black hole value •:

Γ :: Heap Variables ⇀ Terms ∪ {•}

A heap binding can be polymorphically typed. A black hole binding, such as z ↦ •,
indicates that the variable is known but not directly accessible. Intuitively, • is a detectable
bottom.

Notation 8.3.7 Although heaps are functions, we will allow ourselves to use the set
notation freely on them: The notation x ↦ M ∈ Γ simply states that Γ maps x to M.
The empty heap is denoted {}. The notation (Γ, x ↦ M) denotes the heap Γ extended
with a new binding x ↦ M. In this case, x cannot be already bound in Γ, but might
appear free in M. 

Since our language allows input operations, the meaning of a term might depend on the 
input stream it receives while being run. To accommodate this view, we have to consider 
terms and input streams together. 

Definition 8.3.8 (Input streams.) An input stream is a list of characters, not neces-
sarily finite.

Notation 8.3.9 We will use the Haskell list notation to denote input streams. [] (or "")
denotes the empty input stream, i.e., the case when the input is exhausted. Otherwise, a
stream is of the form (c : I), where c is a character and I is an input stream.

Definition 8.3.10 (Term and program states.) A running program is identified by its
program state, which consists of an input stream, a heap and a term state:

(Term States)   P ::= M             Current term
                    | P | ⟨x⟩_r     Passive container
                    | νr.P          Restriction

We use the notation I : Γ : P to denote program states.

A term state is simply the current term under consideration, together with a number
of passive containers. A passive container ⟨x⟩_r represents a mutable variable named r,
which holds a heap variable x. (We only store heap variables in these containers; the
actual contents are stored in the heap.) Restrictions convey the scoping information for 
mutable variables. Notice that a program state contains enough information to capture a 
program in execution. 

Remark 8.3.11 To reduce clutter, we will generally skip the bits of the program state 
that are not needed in the discussion, especially when we write our rules. That is, we 
will use Γ : P, if the input stream is irrelevant, and similarly I : P, when the heap is
not needed. There is no chance of confusion, however, because we only use capital Greek 
letters for heaps and never skip the term state. 

Definition 8.3.12 (The functions bn and fn.) The function bn takes a heap and returns
all the variables bound in it, i.e., bn(Γ) = {x | x ↦ M ∈ Γ}. The function fn is defined
for term states and heaps. Given a term state, fn returns the set of free variables in it. A
heap variable x is free if it is not in the scope of a λx binding. A mutable variable r is
free if it is not in the scope of a νr binding. For a heap Γ, fn(Γ) = ⋃{fn(M) | x ↦ M ∈
Γ} − bn(Γ). We treat fn as a variable-arity function to simplify the notation: fn(A, B)
means fn(A) ∪ fn(B), and similarly for more arguments.

Definition 8.3.13 (Slice of a heap.) The slice of a heap Γ, with respect to a term
state P, written Γ/P, is the subset of Γ that is reachable from the free names of P. More
precisely, for a given Γ and P, let

S_0 = fn(P)

S_{i+1} = S_i ∪ (⋃{fn(M) | x ∈ S_i ∧ x ↦ M ∈ Γ})

and let S = ⋃_{i∈ℕ} S_i. Then,

Γ/P = {x ↦ M | x ∈ S ∧ x ↦ M ∈ Γ}    (8.1)
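
The slice is a reachability computation, which can be sketched in a few lines of Haskell. To keep the example self-contained we make a simplifying assumption: each heap binding is abstracted to the list of free variables of the bound term, and the term state P is represented by its free names. The names Heap, reachable and slice are ours.

```haskell
import Data.List (nub, union)

type Var = String

-- A heap in which every bound term is abstracted to its free variables.
type Heap = [(Var, [Var])]

-- Iterates the step from S_i to S_(i+1) until nothing new is added. For a
-- finite heap this reaches the union of all the S_i.
reachable :: Heap -> [Var] -> [Var]
reachable heap = go . nub
  where
    go s
      | length s' == length s = s
      | otherwise             = go s'
      where
        s' = foldl union s [fvs | (x, fvs) <- heap, x `elem` s]

-- The slice: all bindings whose variable is reachable from the roots.
slice :: Heap -> [Var] -> Heap
slice heap roots = [b | b@(x, _) <- heap, x `elem` live]
  where
    live = reachable heap roots
```

For the heap [("x", ["y"]), ("y", []), ("z", ["x"])] and the root x, the slice consists of the bindings for x and y; the binding for z is dropped because nothing reachable from x refers to it.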

Definition 8.3.14 (Closed program states.) A program state I : Γ : P is closed if
fn(Γ) = ∅, and fn(P) ⊆ bn(Γ). (Note that if the second condition is satisfied, no mutable
variable in P can be free.)

Definition 8.3.15 (Type of a program state.) Let I : Γ : P be a closed program state,
and let M be the term associated with P. We say that I : Γ : P has type τ, and write
(I : Γ : P) :: τ, when M has type τ when typed in the heap Γ.

Definition 8.3.16 (Terminal program state.) A program state I : Γ : P is terminal if
the term associated with P is terminal (Definition 8.3.5).

8.4 Semantics 

We describe the semantics of our language in layers. The IO layer takes care of input-
output and manages mutable variables. The functional layer handles pure computations.
A final set of rules regulates the interaction between these two layers.

Given a term, we need to be able to extract the part that is going to be executed next. 
We use contexts to guide this search: 

Definition 8.4.1 (Execution Contexts.) Execution contexts are described by the fol- 
lowing grammar: 

(Execution Contexts)   E ::= [•]
                           | E >>= M

An execution context is a term with one hole, where the hole itself is filled with a term.
The notation E[M] denotes the context E filled with the term M. An empty context is
one where there are no >>='s, as captured by the first alternative. Otherwise, the context
is non-empty, i.e., it is some IO action followed by others.⁴ If the context is empty, the
term filling the context might be pure.
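
The search for a matching context can be pictured as enumerating all ways of splitting a term into a context and a filler. The following sketch does so for a deliberately tiny term type with atoms and >>= only. A context is represented by the list of right-hand sides of the enclosing >>='s, innermost first. All names (Term, Context, plug, decompositions) are ours.

```haskell
-- A minimal term language: atomic terms and M >>= N.
data Term
  = Atom String
  | Bind Term Term
  deriving (Eq, Show)

-- The context (([.] >>= M1) >>= M2) ... is represented as [M1, M2, ...].
-- The empty list is the empty context.
type Context = [Term]

-- E[M]: fill the hole of a context with a term.
plug :: Context -> Term -> Term
plug ctx m = foldl Bind m ctx

-- All pairs (E, M) such that E[M] is the given term.
decompositions :: Term -> [(Context, Term)]
decompositions t = ([], t) : inner
  where
    inner = case t of
      Bind m n -> [(ctx ++ [n], r) | (ctx, r) <- decompositions m]
      Atom _   -> []
```

For the term Bind (Atom "getChar") (Atom "putChar") there are exactly two decompositions: the empty context filled with the whole term, and the context [•] >>= putChar filled with getChar. Plugging either pair back together recovers the original term.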

8.4.1 IO layer 

Figure 8.1 gives the transition rules for the IO layer. A rule is a (possibly labeled) transition
from a program state to another. The label '!c' indicates that the character c is printed

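4 Other authors use the term evaluation context for this concept [23]. We prefer the term execution,
since a non-empty context can only be filled by an IO action which is going to be executed next.

$$
\begin{array}{cl}
E[\mathit{putChar}\ c] \xrightarrow{\;!c\;} E[\mathit{return}\ ()] & \text{(PUTC)} \\[2ex]
(c : I) : E[\mathit{getChar}] \xrightarrow{\;?c\;} I : E[\mathit{return}\ c] & \text{(GETC)} \\[2ex]
E[\mathit{return}\ N \mathbin{\gg=} M] \longrightarrow E[M\ N] & \text{(LUNIT)} \\[2ex]
\dfrac{r \notin \mathit{fn}(E[\mathit{newIORef}\ M]) \;\wedge\; x \notin \mathit{bn}(\Gamma)}
      {\Gamma : E[\mathit{newIORef}\ M] \longrightarrow (\Gamma, x \mapsto M) : \nu r.(E[\mathit{return}\ r] \mid \langle x \rangle_r)} & \text{(NEWIO)} \\[3ex]
E[\mathit{readIORef}\ r] \mid \langle x \rangle_r \longrightarrow E[\mathit{return}\ x] \mid \langle x \rangle_r & \text{(READIO)} \\[2ex]
\dfrac{y \notin \mathit{bn}(\Gamma)}
      {\Gamma : E[\mathit{writeIORef}\ r\ N] \mid \langle x \rangle_r \longrightarrow (\Gamma, y \mapsto N) : E[\mathit{return}\ ()] \mid \langle y \rangle_r} & \text{(WRITEIO)} \\[3ex]
\dfrac{z \notin \mathit{bn}(\Gamma)}
      {\Gamma : E[\mathit{fixIO}\ M] \longrightarrow (\Gamma, z \mapsto \bullet) : E[M\ z \mathbin{\gg=} \mathit{update}_z]} & \text{(FIXIO)} \\[3ex]
(\Gamma, z \mapsto \bullet) : E[\mathit{update}_z\ M] \longrightarrow (\Gamma, z \mapsto M) : E[\mathit{return}\ z] & \text{(UPDATE)}
\end{array}
$$

Figure 8.1: Semantics: IO layer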

on standard output, and the one labeled '?c' indicates that the next character from the
input stream (which happens to be c) is consumed.⁵

To simplify the notation, we use a couple of conventions in writing our rules (which
are going to be formalized in Section 8.4.4). Rather than a verbal explanation, we will
consider several illustrative examples:

Example 8.4.2 Consider the program state

"ab" : Γ : getChar >>= putChar

for some heap Γ. The term state consists of the single term getChar >>= putChar.
When we match this term to the context grammar given in Definition 8.4.1, we see that
there are two possibilities. Either we can have the empty context, filled with the term
getChar >>= putChar, or the context [•] >>= putChar, filled with the term getChar. Upon
inspection of our rules, we see that only the second has a chance of matching a rule,
namely GETC. Since the GETC rule requires the input stream to be of the form (c : I),



5 Note that this is the same convention as we have used for the execution of fudgets in Section 4.8.

we have to make sure that we have a non-empty stream. Because "ab" is not empty, the
GETC rule is applicable. Hence, we end up with the transition:

"ab" : Γ : getChar >>= putChar  --?a-->  "b" : Γ : return 'a' >>= putChar

Note that the GETC rule does not make use of the heap, hence it is not even mentioned. 
The heap is simply carried across unchanged. 

Example 8.4.3 Consider what happens when we continue the preceding example. Again,
there are two possible choices for the context: the empty context, filled with the term
return 'a' >>= putChar, or the context [•] >>= putChar, filled with the term return 'a'.
Unlike the preceding case, however, the first choice matches the LUNIT rule, while the
second one does not match any. Since the LUNIT rule does not constrain the input stream
or the heap in any way, it is applicable. Hence, we end up with the transition:

"b" : Γ : return 'a' >>= putChar  →  "b" : Γ : putChar 'a'

Since the PUTC rule does not make use of the input stream or the heap, it does not explicitly
mention them. They are both simply copied. It should now be obvious that the next
transition is:

"b" : Γ : putChar 'a'  --!a-->  "b" : Γ : return ()

and there are no more transitions from this state, as none of the rules match. 
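
The three transitions of Examples 8.4.2 and 8.4.3 can be replayed with a small interpreter for the GETC, PUTC and LUNIT rules. This is a sketch under simplifying assumptions that are ours, not the text's: there is no heap, the argument of putChar is already a character, and the right-hand side of >>= is a Haskell function rather than a term of the core language. All names are hypothetical.

```haskell
import Data.List (unfoldr)

data Val = C Char | Unit
  deriving (Eq, Show)

data Term
  = GetChar
  | PutChar Char
  | Return Val
  | Bind Term (Val -> Term)

-- Transition labels: ?c (input), !c (output), or no label.
data Label = In Char | Out Char | Silent
  deriving (Eq, Show)

-- A program state without the heap: input stream and current term.
type State = (String, Term)

step :: State -> Maybe (Label, State)
step (c : i, GetChar)       = Just (In c, (i, Return (C c)))   -- GETC
step (i, PutChar c)         = Just (Out c, (i, Return Unit))   -- PUTC
step (i, Bind (Return v) k) = Just (Silent, (i, k v))          -- LUNIT
step (i, Bind m k)          = do                               -- context E >>= k
  (l, (i', m')) <- step (i, m)
  return (l, (i', Bind m' k))
step _                      = Nothing                          -- no rule matches

-- The sequence of labels observed until no rule applies.
labels :: State -> [Label]
labels = unfoldr step

-- The term getChar >>= putChar.
echo :: Term
echo = GetChar `Bind` putVal
  where
    putVal (C c) = PutChar c
    putVal Unit  = Return Unit
```

Evaluating labels ("ab", echo) gives [In 'a', Silent, Out 'a'], matching the ?a transition, the unlabeled LUNIT transition and the !a transition derived above.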

Example 8.4.4 Consider the program state I : Γ : newIORef 5 >>= readIORef, for some
I and Γ. The only matching choice for the context is [•] >>= readIORef, with the term
newIORef 5 filling the hole. The NEWIO rule applies. To satisfy the precondition of this
rule, we have to pick variables r and x such that r ∉ fn(newIORef 5 >>= readIORef) and
x ∉ bn(Γ). We simply pick fresh variables to satisfy these requests. Let us call them r
and x for simplicity. We end up with the transition:

I : Γ : newIORef 5 >>= readIORef
  →  I : (Γ, x ↦ 5) : νr.(return r >>= readIORef | ⟨x⟩_r)

Example 8.4.5 We will continue with the previous example. Clearly, we want to apply 
the LUNIT rule, but it is not clear how we get over the restriction νr. If we look at
the LUNIT rule, we see that only a term in context is specified (as in all rules except
READIO and WRITEIO). The convention we adopt in this case is the following: If a rule
only mentions a term in a context in the term state position, then we consider the term
associated with the current program state and try to match it. Any remaining restrictions,
passive containers, etc., are copied along. In this case, we obtain:

I : (Γ, x ↦ 5) : νr.(return r >>= readIORef | ⟨x⟩_r)
  →  I : (Γ, x ↦ 5) : νr.(readIORef r | ⟨x⟩_r)

Example 8.4.6 Finally we show how to handle rules that have both a term in context 
and a passive reference mentioned in their left hand sides, namely the WRITEIO and 
READIO rules. Continuing the previous example, we see that the READIO rule needs 
to be applied, which requires a term of the form readIORef r next to a passive container
named r. In this case, our convention is the following: If a rule mentions a term in context
next to a passive container, then a program state matches it if and only if we can show
that the term associated with it matches the term in context, and we are next to the
corresponding passive container. In our case, we get the following transition:

I : (Γ, x ↦ 5) : νr.(readIORef r | ⟨x⟩_r)
  →  I : (Γ, x ↦ 5) : νr.(return x | ⟨x⟩_r)

Remark 8.4.7 The careful reader must have noticed that it is not necessarily the case
that we will always have the required passive container positioned nicely. For example, if
we start with the program state

[] : {} : newIORef 0 >>= λr. newIORef 1 >>= λs. readIORef r

we will end up with:

[] : {x ↦ 0, y ↦ 1} : νr.(νs.(readIORef r | ⟨y⟩_s) | ⟨x⟩_r)

Clearly, we want to apply the READIO rule here as well. Alas, the rule does not
match. In these cases, we will need to use structural rules, which provide means for
transforming the program state into an equivalent one such that there is an applicable
rule. Structural rules are covered in Section 8.4.4.

Some comments about the FIXIO rule are in order. The function fixIO is modeled 
after knot tying recursion semantics. We first create a new heap variable, called z, whose 
value is not yet known. This is achieved by binding it to •. Then, we call the function and 
pass it the argument z, and proceed normally. If the evaluation of this function needs to
know the value of z, the derivation will get stuck with a detected black hole. Otherwise, z
could be passed around, stored in data structures, etc.: Note that it is just a normal heap 
variable. Once the function call completes, we update the heap variable z by the result, 
effectively tying the knot by an application of the UPDATE rule. In summary, z holds 
the value of the entire computation, which might in turn depend lazily on its own value, 
i.e., it is recursively defined. 
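
The knot-tying reading of the FIXIO and UPDATE rules can be mimicked in Haskell itself with a reference cell and lazy I/O. The sketch below is ours and is not the semantics given in this chapter: it relies on unsafeInterleaveIO from System.IO.Unsafe, the name fixIOSketch is hypothetical, and the initial contents of the cell play the role of the black hole •.

```haskell
import Data.IORef (newIORef, readIORef, writeIORef)
import System.IO.Unsafe (unsafeInterleaveIO)

fixIOSketch :: (a -> IO a) -> IO a
fixIOSketch f = do
  -- FIXIO: allocate z, initially bound to a detectable bottom.
  cell <- newIORef (error "black hole: result demanded before the knot was tied")
  -- z is a deferred read of the cell; nothing is read until z is demanded.
  z <- unsafeInterleaveIO (readIORef cell)
  -- Call the function with z and proceed normally.
  result <- f z
  -- UPDATE: overwrite the black hole with the result, tying the knot.
  writeIORef cell result
  return result
```

For example, fixIOSketch (\cs -> return ('a' : cs)) delivers a list whose first three elements are "aaa". If the function demands its argument before returning, as in Example 8.2.2, the initial contents of the cell are reached and an error is raised instead.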

Although the rules of our IO layer are quite similar to those given by Peyton Jones [67],
the following differences are worth mentioning: 

• We keep track of the input stream explicitly, rather than assuming that standard 
input will be consulted whenever a getChar is executed, 

• As in the natural semantics of Launchbury [48] , we keep track of a separate global 
heap to store values of variables, 

• Unlike Peyton Jones's semantics, our reference cells only store heap variables, rather 
than arbitrary terms. This restriction is necessary in order to model sharing implied 
by lazy evaluation. 

8.4.2 Functional layer 

Our rules for the functional layer, given in Figure 8.2, follow Launchbury's natural seman-
tics for lazy evaluation closely [48]. Note that none of the rules in this layer mention the
input stream, as it is irrelevant at this layer. Also, we use the notation ⇓, rather than
→, for reductions. Compared to the IO layer, where we have a small step semantics, the
rules in the functional layer encode a big step natural semantics.

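$$
\begin{array}{cl}
\Gamma : V \Downarrow \Gamma : V & \text{(VALUE)} \\[2ex]
\dfrac{\Gamma : M \Downarrow \Delta : \lambda y. M' \qquad (\Delta, w \mapsto N) : M'[w/y] \Downarrow \Theta : V}
      {\Gamma : M\ N \Downarrow \Theta : V} & \text{(APP)} \\[3ex]
\dfrac{(\Gamma, x \mapsto \bullet) : M \Downarrow (\Delta, x \mapsto \bullet) : V}
      {(\Gamma, x \mapsto M) : x \Downarrow (\Delta, x \mapsto V) : V} & \text{(VAR)} \\[3ex]
\dfrac{(\Gamma, \hat{x}_1 \mapsto \hat{M}_1 \cdots \hat{x}_n \mapsto \hat{M}_n) : \hat{N} \Downarrow \Delta : V}
      {\Gamma : \mathbf{let}\ x_1 = M_1 \cdots x_n = M_n\ \mathbf{in}\ N \Downarrow \Delta : V} & \text{(LET)} \\[3ex]
\dfrac{\Gamma : M \Downarrow \Delta : c_k\ x_k \qquad \Delta : M_k[x_k/y_k] \Downarrow \Theta : V}
      {\Gamma : \mathbf{case}\ M\ \mathbf{of}\ \{c_i\ y_i \to M_i\} \Downarrow \Theta : V} & \text{(CASE)}
\end{array}
$$

Figure 8.2: Semantics: Functional layer

Compared to Launchbury's natural semantics [48], some minor differences worth
mentioning are: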

• We introduce a new black hole binding, 

• The APP rule is generalized to application of terms to terms, rather than terms to 
just variables. Correspondingly, we do not need to perform the normalization pass, 

• We perform renaming in the LET rule, rather than the VAR rule. 

In the APP rule, we require w ∉ bn(Γ). In the LET rule, we rename all bound variables
x_1 ... x_n to x̂_1 ... x̂_n so that there will not be any name clashes in the heap when we do
the additions. Similarly, the term M̂_i denotes the term M_i, where each occurrence of x_i
is replaced by x̂_i. (Similarly for N̂.) The VAR rule is not applicable if the variable being
looked up is bound to • in the heap. If this case ever occurs, the derivation will simply 
terminate with failure, corresponding to a detectable black hole. 
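
The way the VAR rule detects black holes can be illustrated with a small evaluator covering the VALUE, APP and VAR rules. This is a sketch under assumptions of our own: terms are restricted to variables, abstractions, applications and integer literals; heaps are association lists in which Nothing stands for •; and fresh names are generated from the size of the heap, which is adequate only as long as the program itself does not use names of the form w0, w1, and so on.

```haskell
type Var = String

data Term
  = V Var
  | Lam Var Term
  | App Term Term
  | Lit Int
  deriving (Eq, Show)

-- A binding to Nothing plays the role of the black hole.
type Heap = [(Var, Maybe Term)]

-- Overwrite the binding of an already bound variable.
set :: Var -> Maybe Term -> Heap -> Heap
set x e h = [(y, if y == x then e else e') | (y, e') <- h]

-- Replace free occurrences of y by w, where w is assumed to be fresh.
rename :: Var -> Var -> Term -> Term
rename y w t = case t of
  V x | x == y    -> V w
      | otherwise -> V x
  Lam x b | x == y    -> Lam x b
          | otherwise -> Lam x (rename y w b)
  App m n -> App (rename y w m) (rename y w n)
  Lit k   -> Lit k

eval :: Heap -> Term -> Either String (Heap, Term)
eval h t@(Lam _ _) = Right (h, t)                       -- VALUE
eval h t@(Lit _)   = Right (h, t)                       -- VALUE
eval h (App m n)   = do                                 -- APP
  (h1, f) <- eval h m
  case f of
    Lam y body ->
      let w = "w" ++ show (length h1)                   -- fresh heap variable
      in eval ((w, Just n) : h1) (rename y w body)
    _ -> Left "application of a non-function"
eval h (V x) = case lookup x h of                       -- VAR
  Nothing       -> Left ("unbound variable " ++ x)
  Just Nothing  -> Left "black hole"
  Just (Just m) -> do
    (h1, v) <- eval (set x Nothing h) m                 -- x is a black hole meanwhile
    return (set x (Just v) h1, v)                       -- update x with its value
```

Evaluating the application of Lam "x" (V "x") to Lit 5 in the empty heap yields the value Lit 5, while looking up x in the heap [("x", Just (V "x"))] fails with the message "black hole", since x is demanded while it is still being evaluated.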

We refrain from going into details of this layer, as such systems are rather well studied 
in the literature. The interested reader is referred to Launchbury's original exposition [48], 
and Sestoft's work on abstract machines based on such systems [78]. 

8.4.3 The marriage 

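$$
\begin{array}{cl}
\dfrac{\Gamma : M \Downarrow \Delta : k}
      {\Gamma : E[\mathit{putChar}\ M] \longrightarrow \Delta : E[\mathit{putChar}\ k]} & \text{(PUTCEVAL)} \\[3ex]
\dfrac{\Gamma : M \Downarrow \Delta : r}
      {\Gamma : E[\mathit{readIORef}\ M] \longrightarrow \Delta : E[\mathit{readIORef}\ r]} & \text{(READIOEVAL)} \\[3ex]
\dfrac{\Gamma : M \Downarrow \Delta : r}
      {\Gamma : E[\mathit{writeIORef}\ M\ N] \longrightarrow \Delta : E[\mathit{writeIORef}\ r\ N]} & \text{(WRITEIOEVAL)} \\[3ex]
\dfrac{\Gamma : M \Downarrow \Delta : V}
      {\Gamma : E[M] \longrightarrow \Delta : E[V]} & \text{(FUN)}
\end{array}
$$

Figure 8.3: Semantics: Marriage of layers. All these rules are subject to the side condition
that M is not a value.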

Given separate semantics for the IO and functional layers, we need to specify exactly
how they interact. There are two different kinds of interaction. First, whenever we try
to reduce a term of the form, say, putChar M, we first need to consult the functional
layer to reduce the term M to a character. The IO layer will then perform the output.
(Note that the PUTC rule of the IO layer only applies when the argument to putChar is
a constant.) We need similar rules for readIORef and writeIORef as well. The first three
rules in Figure 8.3 take care of this interaction. The second kind of interaction allows
handling of applications, let and case expressions, and variable lookups. This interaction
is provided by embedding the functional world into the IO world, as modeled by the FUN
rule. In all these rules, M is assumed to be a non-value: The functional layer is consulted
to reduce M to a value.

$$
\begin{array}{cl}
\dfrac{s \notin \mathit{fn}(P)}
      {\Gamma : \nu r.P \equiv \Gamma[s/r] : \nu s.P[s/r]} & \text{(ALPHA1)} \\[3ex]
\dfrac{y \notin \mathit{fn}(\Gamma, M, P) \;\wedge\; y \notin \mathit{bn}(\Gamma)}
      {(\Gamma, x \mapsto M) : P \equiv (\Gamma, y \mapsto M)[y/x] : P[y/x]} & \text{(ALPHA2)} \\[3ex]
\dfrac{x \notin \mathit{bn}(\Gamma) \;\wedge\; x \notin \mathit{fn}(\Gamma, P)}
      {\Gamma : P \equiv (\Gamma, x \mapsto M) : P} & \text{(HEAPEXT)} \\[3ex]
P \mid Q \equiv Q \mid P & \text{(COMM)} \\[2ex]
P \mid (Q \mid R) \equiv (P \mid Q) \mid R & \text{(ASSOC)} \\[2ex]
\nu r.\nu s.P \equiv \nu s.\nu r.P & \text{(SWAP)} \\[2ex]
\dfrac{r \notin \mathit{fn}(Q, \Gamma/Q)}
      {\Gamma : (\nu r.P) \mid Q \equiv \Gamma : \nu r.(P \mid Q)} & \text{(EXTRUDE)}
\end{array}
$$

Figure 8.4: Semantics: Structural rules, Part I



8.4.4 Structural rules 

Finally, we need a set of structural rules to shape our derivations. As discussed in
Remark 8.4.7, structural rules do not perform evaluation steps as do the other rules, but
they might be necessary in order to transform a program state to an equivalent one such
that one of the transition rules can apply.

The first set of structural rules, presented in Figure 8.4, states that certain program
states are equivalent to others. As usual, we mention input streams and heaps only
when they are relevant. The ALPHA rules state that heap and mutable variables can be
renamed at will, i.e., we do not distinguish program states that differ only in the names
of variables. (Substitution on heaps is defined as Γ[x/y] = {z ↦ M[x/y] | z ↦ M ∈ Γ}.)
Note that we do not need a side condition of the form s ∉ bn(Γ) in ALPHA1, since only
heap variables can be bound in the heap.

The HEAPEXT rule states that we can add new bindings, as long as they do not
interfere with existing bindings. See Section 8.6 for an example use of this rule.⁶ The rules

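6 We can also add a garbage collection rule to get rid of unreachable heap variables and passive con-
tainers. We will avoid such a rule for the sake of brevity, as it is not essential for our current purposes.

$$
\begin{array}{cl}
\dfrac{P \xrightarrow{\;\alpha\;} Q}
      {\Gamma : P \xrightarrow{\;\alpha\;} \Gamma : Q} & \text{(HEAPIN)} \\[3ex]
\dfrac{P \xrightarrow{\;\alpha\;} Q}
      {I : P \xrightarrow{\;\alpha\;} I : Q} & \text{(STREAMIN)} \\[3ex]
\dfrac{P \xrightarrow{\;\alpha\;} Q}
      {P \mid R \xrightarrow{\;\alpha\;} Q \mid R} & \text{(PAR)} \\[3ex]
\dfrac{P \xrightarrow{\;\alpha\;} Q}
      {\nu r.P \xrightarrow{\;\alpha\;} \nu r.Q} & \text{(NU)} \\[3ex]
\dfrac{\Gamma : P \equiv \Delta : P' \qquad \Delta : P' \xrightarrow{\;\alpha\;} \Theta : Q' \qquad \Theta : Q' \equiv \Sigma : Q}
      {\Gamma : P \xrightarrow{\;\alpha\;} \Sigma : Q} & \text{(EQUIV)}
\end{array}
$$

Figure 8.5: Semantics: Structural rules, Part II. The label α ranges over empty transitions
as well.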

COMM, ASSOC and SWAP state obvious equivalences. Finally, EXTRUDE shows how we
can manipulate the scoping of reference variables. The side condition in the EXTRUDE
rule guarantees that no dangling references will be created. (See Example 8.5.4 for details.)

The second set of structural rules, presented in Figure 8.5, formalizes our conventions in
applying the rules. The first four rules simply state that we can concentrate on the relevant
bits of the derivation and add the extra bits later on. Finally, EQUIV states that we
only need to consider program states up to equivalence when performing transitions.

Example 8.4.8 We will reconsider the example discussed in Remark 8.4.7. Recall that
we had the program state:

[ ] : {x ↦ 0, y ↦ 1} : νr.(νs.(readIORef r | ⟨y⟩_s) | ⟨x⟩_r)

By applying EXTRUDE, ASSOC, COMM, ASSOC and READIOREF rules (and by
appropriate applications of the rules in Figure 8.5 to enable them), we get:

≡ [ ] : {x ↦ 0, y ↦ 1} : νr.(νs.((readIORef r | ⟨y⟩_s) | ⟨x⟩_r))
≡ [ ] : {x ↦ 0, y ↦ 1} : νr.(νs.((readIORef r | ⟨x⟩_r) | ⟨y⟩_s))
→ [ ] : {x ↦ 0, y ↦ 1} : νr.(νs.((return x | ⟨x⟩_r) | ⟨y⟩_s))

There are no matching rules for the resulting program state. We can apply structural 
rules again, but none will give us a program state where a non-structural rule can apply. 

Remark 8.4.9 One can extend ≡ to an equivalence relation on program states, simply
by adding rules to make it reflexive and transitive. However, the current definition of ≡
given in Figure 8.4 is simply too crude to be useful for this purpose. Intuitively, we want to
be able to identify program states if their "observable behavior" is the same [27, 57, 71].
We leave the exploration of this idea for future work.

8.4.5 Meaning of program states 

The meaning of a closed program state is its derivation: 

Definition 8.4.10 (Derivations.) Let I : Γ : P be a closed program state. The
derivation for I : Γ : P is a sequence of labeled transitions, where at each step a rule is
applied. Structural rules can be applied at any time, as long as they trigger the application
of a non-structural rule. The derivation continues until there are no applicable rules.

Simple inspection of our rules reveals that we have a deterministic system modulo the 
structural rules. That is, given a program state there is at most one non-structural rule 
that can apply to it. 

Definition 8.4.11 (Effect of a derivation.) The effect of a derivation is the concatena- 
tion of its transition labels. Empty transitions do not contribute to the effect. 

The effect of a program state is simply a (possibly infinite) list, where each element is 
of the form '?c' or '!c' for some character c. 

Notation 8.4.12 As usual, →* is the reflexive transitive closure of →. We will shorten
multiple steps of derivations using the notation I : Γ : P →* I′ : Γ′ : P′.

Definition 8.4.13 (Divergent and normal program states.) A closed program state
I : Γ : P is called divergent if the derivation starting from I : Γ : P either

• continues indefinitely (i.e., we never run out of non-structural rules to apply),

• or gets stuck in a non-terminal program state (Definition 8.3.16) where no
non-structural rule applies.

Otherwise, I : Γ : P is called normal.

Example 8.4.14 It is easy to come up with divergent terms. For instance, one can show
that the derivation for:

I : Γ : let loop = putChar 'a' >> loop in loop        (8.2)

diverges, since we never run out of rules to apply. However, the derivation for:

I : Γ : let x = x in x        (8.3)

will diverge by getting stuck. The FUN rule will never fire, because there are no reductions
for this term in the functional layer. (Notice that the first application of the VAR rule
will result in I : (Γ, x ↦ •) : x, but no other rule will apply since the VAR rule is only
applicable when the binding is not a black hole.) Similarly, a derivation can get stuck via
the use of the FIXIO rule (which introduces a black hole binding in the heap). A final
possibility is the application of the GETC rule when the input stream is empty.

Lemma 8.4.15 (Derivations for normal program states.) Let I : Γ : P be a normal
program state. The derivation starting at this state will take the form

I : Γ : P -α->* I′ : Δ : Q

where I′ is a suffix of I. Furthermore, Q can be transformed using only the structural
rules to the form νr̄.(N | C), where N is a terminal value (Definition 8.3.5), and C is a
number (possibly zero) of parallel passive containers. The restrictions encoded by r̄ cover
all passive containers in C.

Proof (Sketch.) By Definition 8.4.13, our proof obligation reduces to establishing that
Q can be transformed into the required νr̄.(N | C) form. By inspection of the structural
rules, we see that the rule EXTRUDE can be repeatedly used to move restrictions to the
top, obtaining the required form. (ALPHA rules can be used to resolve naming conflicts,
if any.) To see the correspondence between restrictions and the passive containers, just
notice that they are introduced together by NEWIO, they are never removed, and all rules
respect the scoping of ν bindings. □

Observation 8.4.16 Note that derivations apply to both pure and IO terms. A derivation
either diverges, or ends up with an abstraction or a saturated constructor application
for a pure term, or with a term of the form return M for an IO term.

Proposition 8.4.17 (Derivations for IO terms in contexts.) Let I : Γ : νr̄.(E[M] | C)
be a closed program state, where M is an IO term. The derivation starting at this state
will either diverge, or take the form:

I : Γ : νr̄.(E[M] | C)
-α->* I′ : Δ : νr̄′.(E[return N] | C)
→* I″ : Θ : νr̄″.(return O | C)

where I′ is a suffix of I, and I″ is a suffix of I′.

Proof By inspection of our rules, we see that if the derivation for Γ : νr̄.(E[M] | C)
terminates, then so must the derivation for Γ : νr̄.(M | C). Hence, by the previous
lemma, it must do so in the required intermediate form. The form of the final state is
again guaranteed by the previous lemma. □

To be able to talk about strictness (Equation 2.1), we need to say what ⊥ means for
the type IO τ:

Definition 8.4.18 (Silent derivations.) A derivation is silent if its effect is empty.

Definition 8.4.19 (Bottoms of IO.) A closed program state (I : Γ : M) :: IO τ is a
bottom element (⊥) for the type IO τ iff the derivation for I : Γ : M silently diverges.

Example 8.4.20 It is easy to see that Program State 8.2 is not a ⊥ of IO, but Program
State 8.3 is. While they both diverge, the former is not silent.

Definition 8.4.21 (Strict functions.) Let Γ be a heap and M be a term such that the
program state ([ ] : Γ : M) :: τ → IO α is closed. M is strict if, for all I and Δ ⊇ Γ with
x ∉ bn(Γ), the derivation for

I : (Δ, x ↦ •) : M x

is silently divergent.



8.5 Examples 

We revisit the examples given in Section 8.2 and show how our semantics can handle
them. In these examples, we will use the letters a,b, . . . to represent heap variables as 
well. To save space, we will apply the structural rules silently. 

Example 8.5.1 We will revisit Example 8.2.1. We first remove the do-notation in favor
of explicit >>='s:

fixIO (λcs. getChar >>= λc. return (c : cs))

To reduce clutter, we will not write the input stream explicitly. We have:

{} : fixIO (λcs. getChar >>= λc. return (c : cs))
→* (FIXIO - FUN)
{z ↦ •, a ↦ z} : getChar >>= λc. return (c : a) >>= update_z
→ (GETC - assume input stream has ch in front)
{z ↦ •, a ↦ z} : return ch >>= λc. return (c : a) >>= update_z
→* (LUNIT - FUN)
{z ↦ •, a ↦ z, b ↦ ch} : return (b : a) >>= update_z
→ (LUNIT)
{z ↦ •, a ↦ z, b ↦ ch} : update_z (b : a)
→ (UPDATE)
{z ↦ b : a, a ↦ z, b ↦ ch} : return z

The derivation terminates with a terminal program state at this point. Hence the initial
program state is normal. The final heap contains the cyclic structure that represents the
infinite list of ch's, where ch is the character that was read by getChar. In case elements of this
list are demanded in a context, the usual demand-driven rules modeled by our semantics
would let us produce enough elements to satisfy the need. If the input stream is empty to
start with, the derivation will simply block at the point where the GETC rule is applied,
and wait forever, i.e., the derivation will diverge by getting stuck.
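
The cyclic structure built by this derivation can also be observed in an actual Haskell implementation. The following is a minimal sketch, assuming GHC's `fixIO` from `System.IO`; the helper name `cyclicFrom` is hypothetical and does not appear in the thesis. It abstracts over the action that supplies the character, so that `getChar` can be replaced by any other `IO Char` action when experimenting.

```haskell
import System.IO (fixIO)

-- Hypothetical helper (not from the thesis): run the action once and
-- tie the knot, so the result is the infinite list of the one
-- character it produced, as in Example 8.5.1.
cyclicFrom :: IO Char -> IO String
cyclicFrom act = fixIO (\cs -> act >>= \c -> return (c : cs))
```

With this definition, `cyclicFrom getChar` corresponds to the term of Example 8.5.1. Demanding a finite prefix of the result (for instance with `take`) unfolds only as much of the cycle as is needed.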

Example 8.5.2 Showing that Example 8.2.2 diverges is fairly easy. We have:

{} : fixIO (λc. putChar c >>= λd. return 'a')
→* (FIXIO - FUN)
{z ↦ •, a ↦ z} : putChar a >>= λd. return 'a'

And now, we need to apply the PUTCEVAL rule to reduce the variable a to a character.
The functional layer first reduces a to z using the VAR rule, but gets stuck at that point,
as z is bound to • in the heap and the VAR rule does not apply anymore.

Example 8.5.3 We now reconsider Example 8.2.3, which involves reference cells. Again,
removing do-notation and simplifying the patterns, we get:

fixIO (λt. newIORef (fst t) >>= λy.
           return (1 : fst t, y))
  >>= λu. readIORef (snd u)

Since there are no calls to getChar, the input stream does not matter. That is, we will
simply copy the same input stream through all transitions in our derivation. Therefore,
we simply do not write it explicitly in what follows.

We will first consider the fixIO call. To save space, we will abbreviate newIORef to
new and readIORef to read:

{} : fixIO (λt. new (fst t) >>= λy. return (1 : fst t, y))
→* (FIXIO - FUN)
{z ↦ •, a ↦ z} : new (fst a) >>= λy. return (1 : fst a, y) >>= update_z
→ (NEWIO)
{z ↦ •, a ↦ z, b ↦ fst a} :
    νr.(return r >>= λy. return (1 : fst a, y) >>= update_z | ⟨b⟩_r)
→* (LUNIT - FUN)
{z ↦ •, a ↦ z, b ↦ fst a, c ↦ r} :
    νr.(return (1 : fst a, c) >>= update_z | ⟨b⟩_r)
→* (LUNIT - UPDATE)
{z ↦ (1 : fst a, c), a ↦ z, b ↦ fst a, c ↦ r} : νr.(return z | ⟨b⟩_r)

When we consider the original expression, it is not hard to see that we will have:

→ (LUNIT - FUN)
{z ↦ (1 : fst a, c), a ↦ z, b ↦ fst a, c ↦ r, d ↦ z} :
    νr.(read (snd d) | ⟨b⟩_r)
→ (READIOEVAL)
{z ↦ (e, f), a ↦ z, b ↦ fst a, c ↦ r, d ↦ (e, f), e ↦ 1 : fst a, f ↦ r} :
    νr.(read r | ⟨b⟩_r)
→ (READIOREF)
{z ↦ (e, f), a ↦ z, b ↦ fst a, c ↦ r, d ↦ (e, f), e ↦ 1 : fst a, f ↦ r} :
    νr.(return b | ⟨b⟩_r)

Now, if we chase the value of b in the heap, we see that we will end up with a cyclic
structure effectively representing the infinite list of 1's, as intended. The most interesting
step in this derivation is the application of the READIOEVAL rule. The function snd is
a shorthand for case over the pairing constructor. The VAR rule in the functional layer
arranges for sharing, resulting in an abundance of variables in the resulting heap. Notice
that, abusing the notation slightly, in the above derivation (1 : fst a, c) refers to a function
application: the pairing constructor applied to the terms 1 : fst a and c. In the last two
lines, however, (e, f) is a value, i.e., in this case, the pairing constructor applied to the
right number of arguments.
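
For readers who wish to try this term in an implementation, here is a small sketch in plain Haskell, assuming GHC's `fixIO` and `Data.IORef`. The name `onesViaRef` is ours, chosen for illustration only. The reference cell is created with the not-yet-available first component of the fixed point, exactly as in the term above.

```haskell
import Data.IORef (newIORef, readIORef)
import System.IO (fixIO)

-- Hypothetical name; the body transcribes the term of Example 8.5.3.
-- The reference stores the (lazy) first component of the pair that
-- fixIO is still in the process of computing.
onesViaRef :: IO [Int]
onesViaRef =
  fixIO (\t -> newIORef (fst t) >>= \y ->
               return (1 : fst t, y))
    >>= \u -> readIORef (snd u)
```

Reading the reference after the knot is tied yields the cyclic list, so taking a finite prefix of the result of `onesViaRef` produces only 1's.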

Example 8.5.4 This example demonstrates the importance of the side condition of the
EXTRUDE rule. Consider:

do j <- new 5
   k <- new j
   l <- read k
   read l

By removing the do-notation, we get:

new 5 >>= new >>= read >>= read

We will try to give a derivation for this expression, ignoring the side condition of the
EXTRUDE rule. Again the input stream is irrelevant, and hence ignored:

{} : new 5 >>= new >>= read >>= read
→ (NEWIOREF)
{x ↦ 5} : νj.(return j >>= new >>= read >>= read | ⟨x⟩_j)
→* (LUNIT - NEWIOREF)
{x ↦ 5, y ↦ j} : νj.(νk.(return k >>= read >>= read | ⟨y⟩_k) | ⟨x⟩_j)
→ (COMM)
{x ↦ 5, y ↦ j} : νj.(⟨x⟩_j | νk.(return k >>= read >>= read | ⟨y⟩_k))
→ (EXTRUDE - incorrect application)
{x ↦ 5, y ↦ j} : νj.(⟨x⟩_j) | νk.(return k >>= read >>= read | ⟨y⟩_k)
→* (LUNIT - READ - LUNIT)
{x ↦ 5, y ↦ j} : νj.(⟨x⟩_j) | νk.(read y | ⟨y⟩_k)
→ (READIOEVAL)
{x ↦ 5, y ↦ j} : νj.(⟨x⟩_j) | νk.(read j | ⟨y⟩_k)

And now we are stuck! The mutable variable j is not visible at this point. Since we
were not careful in applying the EXTRUDE rule, we have created a dangling reference. Let
us construct the slice when the rule is applied:

S_0 = {y}, S_1 = {y, j}, S_2 = S_1 = S_∞

By Equation 8.1, the slice is {y ↦ j}. Since j ∈ fn({y ↦ j}), EXTRUDE is not
applicable. The side condition prevents the creation of the dangling reference.
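
As a sanity check on the intended behavior, the same program can be written directly against `Data.IORef`. This is only an illustrative sketch (the name `nestedRefs` is hypothetical): in an implementation, references never dangle, so the computation simply returns the value stored in the innermost cell, which is the outcome a correct derivation (one that respects the side condition) must also reach.

```haskell
import Data.IORef (newIORef, readIORef)

-- Hypothetical name; the body is the desugared program of
-- Example 8.5.4 with new/read spelled out as newIORef/readIORef.
nestedRefs :: IO Int
nestedRefs = newIORef (5 :: Int) >>= newIORef >>= readIORef >>= readIORef
```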
8.6 Properties of fixIO

Equipped with the semantics we have presented so far, we are now in a position to look 
at the properties of fixIO. 



Strictness. Consider Equation 2.1, and let Γ be a heap where f is properly bound.
Assuming f is strict (Definition 8.4.21), we will have:

I : Γ : fixIO f → I : (Γ, z ↦ •) : f z >>= update_z

by a single application of the FIXIO rule. The current context specifies that the application
f z should be evaluated. By Definition 8.4.21, the derivation will silently diverge. But
then, by Definition 8.4.19, this divergence implies that fixIO f is ⊥.

Example 8.6.1 Using if as a shorthand for case over the boolean type, consider:

I : {} : fixIO (λx. if x == 0 then return 1 else return 2)
→ (FIXIO - FUN)
I : {z ↦ •, a ↦ z} : if a == 0 then return 1 else return 2 >>= update_z
... detected black hole ...

In the last step, the FUN rule is not applicable because there are no reductions for the
current term in the functional layer.

Example 8.6.2 Consider the following non-strict function:

λx. return x :: Char → IO Char

Notice that it returns a computation successfully. Of course, if the result of the
fixed-point computation is used, it will still diverge, but for a different reason:

I : {} : fixIO (λx. return x) >>= putChar
→ (FIXIO - FUN)
I : {z ↦ •, a ↦ z} : return a >>= update_z >>= putChar
→ (LUNIT - UPDATE - LUNIT)
I : {z ↦ a, a ↦ z} : putChar z
... detected black hole ...

The last step diverges, because the VAR rule will get stuck trying to reduce z to a character.

Example 8.6.3 Consider the function:

λa. putChar 'q' >> if a == 1 then return 1 else return 2

which is not strict according to our semantics. Here is the derivation for it:

I : {} : fixIO (λa. putChar 'q' >> if a == 1 then return 1 else return 2)
→* (FIXIO - FUN)
I : {z ↦ •, a ↦ z} :
    putChar 'q' >> if a == 1 then return 1 else return 2 >>= update_z
→ (PUTC)
I : {z ↦ •, a ↦ z} : if a == 1 then return 1 else return 2 >>= update_z
... detected black hole ...

Before getting stuck, we see the character q printed, which is the correct behavior.
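
The behavior of the last two examples can be probed in an implementation. The sketch below assumes GHC, where a premature demand on the fixed-point variable inside `fixIO` typically surfaces as a run-time exception rather than as silent non-termination. How a black hole is reported is outside the scope of our semantics, and the exact exception depends on the version of the `base` library, so the code catches it generically. The names `printThenStuck` and `nonStrictKnot` are hypothetical.

```haskell
import Control.Exception (SomeException, try)
import System.IO (fixIO)

-- Example 8.6.3: the character 'q' is printed first, and only then
-- does the conditional demand the not-yet-defined value.
-- We expect a Left result in GHC (assumption: the black hole is
-- reported as an exception).
printThenStuck :: IO (Either SomeException Int)
printThenStuck =
  try (fixIO (\a -> putChar 'q' >>
                    if a == (1 :: Int) then return 1 else return 2))

-- Example 8.6.2 without the final putChar: the fixed-point
-- computation itself succeeds, as long as its result is never used.
nonStrictKnot :: IO ()
nonStrictKnot = fixIO (\x -> return (x :: Char)) >> return ()
```

In both cases the effects performed before the recursive value is demanded are retained, in agreement with the derivations above.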

Purity. Consider Equation 2.2, where we will use a let expression to capture fix:

fixIO (return · h) = return (let a = h a in a)

Assume Γ is a heap such that ([ ] : Γ : h) :: τ → τ. On the left hand side, we have:

I : Γ : fixIO (return · h)
→* (FIXIO - FUN)
I : (Γ, z ↦ •, a ↦ z) : (return · h) a >>= update_z
→ (LUNIT)
I : (Γ, z ↦ •, a ↦ z) : update_z (h a)
→ (UPDATE)
I : (Γ, z ↦ h a, a ↦ z) : return z

Considering the right-hand-side, we immediately get: I : Γ : return (let a = h a in a).

We should now prove that these two program states are equivalent, i.e., that the rules
in our system cannot tell them apart. Such an argument would require a notion of program
state equivalence that is more general than what our structural rules provide. Intuitively,
the program states above will be considered equivalent if we can show that

I : (Γ, z ↦ h a, a ↦ z) : z ≡ I : Γ : let a = h a in a

Note that the second program state reduces to I : (Γ, z ↦ h z) : z. Hence, the equivalence
is clear provided we adopt a compaction rule that gets rid of the indirection via a in
the first heap. To formalize this argument, we need a precise definition of program state
equivalence and a proof system for showing when two program states are the same. We
leave the development of such a system for future work.
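
Although we do not prove the purity property formally, both sides of the equation are easy to compare on concrete instances. The following sketch assumes GHC's `fixIO` and the standard `fix` from `Data.Function`; the names `purityLhs` and `purityRhs` are ours. It is an experiment on particular functions h, not a proof.

```haskell
import Data.Function (fix)
import System.IO (fixIO)

-- The two sides of the purity equation, for an arbitrary pure h.
purityLhs, purityRhs :: (a -> a) -> IO a
purityLhs h = fixIO (return . h)   -- left-hand side
purityRhs h = return (fix h)       -- right-hand side: let a = h a in a
```

For instance, with h = (1 :), both computations deliver a cyclic list of 1's, and any finite prefix of the two results coincides.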



Left shrinking. Consider Equation 2.3, where we will refer to the computation a as q
to avoid confusion with heap variables. For the left hand side we get:

I : Γ : fixIO (λx. q >>= λy. f x y)
→* (FIXIO - FUN)
I : (Γ, z ↦ •, a ↦ z) : q >>= λy. f a y >>= update_z

On the right hand side, we have I : Γ : q >>= λy. fixIO (λx. f x y). Now, if the
derivation for q diverges, both derivations will diverge in the exact same way, that is, both
sides are equivalent. Otherwise, by Lemma 8.4.15, we will have:

I : Γ : q -α->* I′ : Δ : νr̄.(return q_v | C)

The C on the right hand side captures the passive containers that might be introduced
in the derivation for q, along with the associated restrictions νr̄. Since these containers
will get copied in exactly the same way, we do not show them explicitly in the following
discussion. Using the HEAPEXT and EXTRUDE rules silently, the left hand side yields:

I : (Γ, z ↦ •, a ↦ z) : q >>= λy. f a y >>= update_z
-α->* (ASSUMPTION)
I′ : (Δ, z ↦ •, a ↦ z) : return q_v >>= λy. f a y >>= update_z
→* (LUNIT, FUN)
I′ : (Δ, z ↦ •, a ↦ z, b ↦ q_v) : f a b >>= update_z

Let us look at the right hand side:

I : Γ : q >>= λy. fixIO (λx. f x y)
-α->* (ASSUMPTION - LUNIT)
I′ : (Δ, b ↦ q_v) : fixIO (λx. f x b)
→* (FIXIO - FUN)
I′ : (Δ, b ↦ q_v, z ↦ •, a ↦ z) : f a b >>= update_z

Hence, the left shrinking property holds for fixIO. We conclude that, with respect to our
semantics, fixIO is a legitimate value recursion operator for the IO monad.
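
The left shrinking property can likewise be observed on a concrete instance. In the sketch below (assuming GHC's `fixIO`; all names are hypothetical), the action `tick` plays the role of q: it performs an effect that does not depend on the recursive variable. The function `consOnto` plays the role of f and uses the recursive variable lazily.

```haskell
import Data.IORef (IORef, modifyIORef, readIORef)
import System.IO (fixIO)

-- q: increments a counter and returns its new value.
tick :: IORef Int -> IO Int
tick counter = modifyIORef counter (+ 1) >> readIORef counter

-- f: conses the value delivered by q onto the recursive variable.
consOnto :: [Int] -> Int -> IO [Int]
consOnto xs y = return (y : xs)

-- The two sides of the left shrinking equation.
shrinkLhs, shrinkRhs :: IORef Int -> IO [Int]
shrinkLhs c = fixIO (\xs -> tick c >>= \y -> consOnto xs y)
shrinkRhs c = tick c >>= \y -> fixIO (\xs -> consOnto xs y)
```

Starting each side from a fresh counter holding 0, both deliver the cyclic list of 1's and leave the counter at 1: the effect of q is performed exactly once on either side.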
Other properties. As pointed out in Corollary 3.1.7, neither strong sliding nor right
shrinking properties hold for Haskell's IO monad. Both of them fail with respect to the
semantics we have given in this chapter as well. (Application of our rules to functions used
in Propositions 3.1.5 and 3.1.6 suffices to show the failure in both cases.) We believe that
sliding and nesting properties should hold, both for Haskell's IO monad and our semantics.
We leave the construction of proofs for these properties for future work.

8.7 Summary 

In this chapter, we have described an operational semantics for a non-strict functional
language extended with monadic IO, references, and value recursion, improving on our
earlier work [20]. Our contributions are: (i) we show how a purely functional language
and its semantics can be embedded into a language with monadic I/O primitives and
references, (ii) we model sharing explicitly at all levels, giving an account of call by need
in both the functional and the IO layers, and (iii) we provide a semantics for fixIO and
show that it is a value recursion operator.

Our work can be extended in several ways. Addition of threads and synchronized
variables seems to be fairly easy [67]. The difficulty, however, lies in adding support for
asynchronous exceptions [56]. Although exceptions can be modeled nicely in the IO layer,
we currently do not see a complementary way of capturing them in the functional layer
using our method.

More work is needed in formalizing our arguments. Of the highest importance is the
development of a notion of program equivalence, and tools for reasoning about program
states which may contain symbolic terms. In this direction, program equivalence based
on observational behavior seems to be the right framework [27, 61].

One important issue we have side-stepped in this chapter is that of parametricity. How
do we know that the constants of our language (i.e., return, >>=, fixIO, newIORef, etc.)
are parametric? To talk about parametricity, we first need to define what it means for
two program states to be related. Our earlier attempts at stating and establishing
parametricity failed, mainly due to the lack of an appropriate notion of program equivalence.
Pitts's work on observational equivalence and parametric polymorphism [71] can be used
as a basis for such a work, although it is not immediately clear how to accommodate
references and input/output operations. Similarly, Launchbury and Peyton Jones discuss
parametricity of constants for manipulating references in the context of the state monad
of Haskell [52], but their results are not directly applicable in our framework due to
differences in the notion of reference variables, handling of the heap, and the additional
complexity introduced by input/output.



Chapter 9 
Examples 



In this chapter, we will consider a number of practical programming examples, illustrating
the use of value recursion operators and the mdo-notation.¹

Synopsis. Starting with the famous repmin problem, we consider applications in sorting 
networks, screen layout in GUI's, interpreters, cyclic graphs, and the implementation of 
logical variables. 



9.1 The repmin problem 

The repmin problem is concerned with the replacement of all the numbers in a binary tree 
by their minimum. The challenge is to do so in a single pass [6] CE6] • In 1984, Richard Bird 
devised a beautiful solution to this problem, exploiting laziness and cyclic definitions: 

data Tree a = L a | B (Tree a) (Tree a) deriving Show

copy :: Tree Int -> Int -> (Tree Int, Int)
copy (L a)   m = (L m, a)
copy (B l r) m = let (l', ml) = copy l m
                     (r', mr) = copy r m
                 in (B l' r', ml `min` mr)

repmin :: Tree Int -> Tree Int
repmin t = let (t', m) = copy t m in t'

Here's an example run: 

Main> repmin (B (L 11) (B (L 2) (L 3))) 
B (L 2) (B (L 2) (L 2)) 

1 Before proceeding with the examples in this chapter, the reader may want to review our motivating
circuit modeling example, covered in Sections 1.2 and 7.1.

The single pass solution is achieved by the clever use of recursion in the let-expression of
the function repmin. By virtue of the recursive binding, the function copy simultaneously
computes and replaces all the leaves with m, the minimum value in the tree.

Benton and Hyland take the problem one step further [5]. What if we also want 
to perform an effect, such as printing the values stored in the nodes during this single 
traversal as well? It is easy to modify copy to achieve this effect: 

copyPrint :: Tree Int -> Int -> IO (Tree Int, Int)
copyPrint (L a)   m = do print a
                         return (L m, a)
copyPrint (B l r) m = do (l', ml) <- copyPrint l m
                         (r', mr) <- copyPrint r m
                         return (B l' r', ml `min` mr)

But, it is not clear at all how to modify repmin accordingly. Obviously, the attempt: 

copyPrint t m >>= \(t', m) -> return t'

is flawed, since m is no longer recursively bound! We need to tie the recursive knot with
an appropriate value recursion operator. In this particular case, the appropriate operator
is the one for the IO monad, i.e., fixIO of Chapter 8:

repminPrint :: Tree Int -> IO (Tree Int)
repminPrint t = fixIO (\ ~(t', m) -> copyPrint t m)
                >>= \(t', m) -> return t'

Or, using the mdo-notation:

repminPrint :: Tree Int -> IO (Tree Int)
repminPrint t = mdo (t', m) <- copyPrint t m
                    return t'

hiding the explicit call to fixIO, considerably improving readability. 
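
For readers using a current GHC, the mdo-notation is available through the `RecursiveDo` language extension. The following self-contained sketch restates the definitions above in that setting. The only deviations from the text are the extension pragma, the import, and the added `Eq` instance (used only for comparing results); these are assumptions about the reader's environment rather than part of the original development.

```haskell
{-# LANGUAGE RecursiveDo #-}
import Control.Monad.Fix (MonadFix)  -- brings the MonadFix IO instance into scope

data Tree a = L a | B (Tree a) (Tree a) deriving (Eq, Show)

-- Prints every leaf while rebuilding the tree with m at the leaves;
-- m itself is never inspected during the traversal.
copyPrint :: Tree Int -> Int -> IO (Tree Int, Int)
copyPrint (L a) m = do
  print a
  return (L m, a)
copyPrint (B l r) m = do
  (l', ml) <- copyPrint l m
  (r', mr) <- copyPrint r m
  return (B l' r', ml `min` mr)

-- m is used before it is bound: the mdo-block ties the knot.
repminPrint :: Tree Int -> IO (Tree Int)
repminPrint t = mdo
  (t', m) <- copyPrint t m
  return t'
```

Running `repminPrint (B (L 11) (B (L 2) (L 3)))` prints the leaves 11, 2 and 3 in traversal order, and returns the tree in which every leaf has been replaced by the minimum.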

Note that we can accommodate arbitrary effects during the traversal of the original 
tree, as long as the underlying monad comes equipped with a value recursion operator. 
To illustrate, consider the following variation of the repmin problem, demonstrating the 
use of value recursion for the list monad (Section 4.3). Consider the data type:

data Exp = C Int | A Exp Exp

representing simple arithmetic expressions formed out of integer constants and additions.
The problem is to find all possible pair-swaps of a given expression. A swapping is defined
to be the exchange of any two constants, not necessarily distinct.² Solving the swappings
problem is not a terribly hard task. Here, we present a particularly neat solution,
illustrating the use of value recursion for the list monad:

replace :: Int -> Exp -> [(Exp, Int)]
replace x (C y)   = [(C x, y)]
replace x (A l r) = [(A l' r, y) | (l', y) <- replace x l]
                 ++ [(A l r', y) | (r', y) <- replace x r]

pairSwaps :: Exp -> [Exp]
pairSwaps e = mdo (e', m)  <- replace n e
                  (e'', n) <- replace m e'
                  return e''

The call replace x e creates copies of e, where each copy has one of its constants 
replaced by x. Each replaced constant is returned along with the corresponding copy. (If 
there are n constants in e, the call to replace will return n copies.) For instance: 

replace 1 2       ⇒ [(1, 2)]
replace 1 (2 + 3) ⇒ [(1 + 3, 2), (2 + 1, 3)]

The function pairSwaps makes two successive calls to replace, threading the input 
expression through. The first call replaces each constant with n (yet to be computed), 
determining the respective values for m. The second call completes the swapping by 
substituting m's, and by computing the values of n needed in the first call. Each pairing 
of m and n corresponds to a possible swapping. The cyclic dependence between m and n 
achieves the required swapping quite neatly. 

Here is an example run for the input (1 + 2) +3, using appropriate functions for parsing 
and printing: 

Main> display (pairSwaps (parse "(1 + 2) + 3")) 
[(1 + 2) + 3, (2 + 1) + 3, (3 + 2) + 1, 

(2 + 1) + 3, (1 + 2) + 3, (1 + 3) + 2, 

(3 + 2) + 1, (1 + 3) + 2, (1 + 2) + 3] 
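
The run above can be reproduced with a current GHC using the `RecursiveDo` extension and the `MonadFix` instance for lists from the `base` library. This is a sketch under those assumptions. The function `render` is a hypothetical stand-in for the `display` function used in the transcript, whose definition is not given in the text, and the input expression is built directly instead of being parsed.

```haskell
{-# LANGUAGE RecursiveDo #-}
import Control.Monad.Fix (MonadFix)  -- brings the MonadFix [] instance into scope

data Exp = C Int | A Exp Exp

replace :: Int -> Exp -> [(Exp, Int)]
replace x (C y)   = [(C x, y)]
replace x (A l r) = [(A l' r, y) | (l', y) <- replace x l]
                 ++ [(A l r', y) | (r', y) <- replace x r]

-- n is used in the first statement but bound in the second.
pairSwaps :: Exp -> [Exp]
pairSwaps e = mdo
  (e', m)  <- replace n e
  (e'', n) <- replace m e'
  return e''

-- Hypothetical pretty-printer: parenthesizes nested additions only.
render :: Exp -> String
render (C k)   = show k
render (A l r) = sub l ++ " + " ++ sub r
  where sub x@(C _) = render x
        sub x       = "(" ++ render x ++ ")"
```

Evaluating `map render (pairSwaps (A (A (C 1) (C 2)) (C 3)))` yields the nine strings listed in the run above, in the same order.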

The value recursion operator used implicitly in the definition of pairSwaps is the one
given by Equation 4.4. Recall that we have considered an infinite family of candidate
operators for the list monad in Section 4.3 (see Equation 4.13). We have argued that these



2 For instance, the only possible swapping of 1 is 1, while those of 1 + 2 are 1 + 2, 2 + 1, 2 + 1, and 1 + 2.
The two 2 + 1's are considered different, corresponding to the swappings of 1-2 and 2-1. It is easy to see
that an expression with n constants will have n² swappings, one for each pair of constants.

candidates behave strangely, violating the mandatory left shrinking property. We take
this opportunity to show that they yield weird results for the swapping problem as well.
For instance, the use of mfix₁ yields:

Main> display (pairSwaps1 (parse "(1 + 2) + 3"))
[(1 + 2) + 3, (2 + 1) + 3, (2 + 2) + 1, 

(2 + 2) + 3, (1 + 2) + 3, (1 + 2) + 2, 

(3 + 2) + 2, (1 + 3) + 2, (1 + 2) + 3] 

producing illegal swappings such as (2 + 2) + 1. The failure of the left-shrinking property
causes unwanted interference when the constants are paired.³



9.2 Sorting networks and screen layout in GUI's 

A sorting network is a collection of comparators, connected in such a way that the output
of the network is always the sorted permutation of its input [15]. For instance, the following
network can sort four numbers:

For each comparator, the wire to its right carries the maximum of its inputs, while the
lower one carries the minimum. In this particular example, a, b, c, and d are the inputs,
while k, l, m, and n are the outputs.

How can we implement a sorting network so that we not only get the values sorted, but 
also a transcript of the operations performed during sorting? We want each comparator 
unit to report on the operation it performed while sorting took place. The output monad 

3 In certain cases, the operation of the value recursion operator for the list monad can be understood in
terms of the usual translation rules for list-comprehensions [89], using symbolic substitution for variables
that occur recursively [18, Sections 1 and 6.3]. The details, although not terribly important, might be
enjoyable for the curious reader, providing some more insight about the behavior of mfix for the list monad.

(Section 4.5) springs to mind. We can translate the sorting network above almost literally
into the following Haskell code:

newtype Out a = Out (a, String)

instance Monad Out where
  return x          = Out (x, "")
  Out ~(x, s) >>= f = let Out (y, s') = f x in Out (y, s ++ s')

instance Show a => Show (Out a) where
  show (Out (v, s)) = show v ++ s

comp :: Int -> (Int, Int) -> Out (Int, Int)
comp i (a, b) = Out ((a `max` b, a `min` b), "\nUnit " ++ show i ++ msg)
  where msg = (if a < b then ": swap: " else ": pass: ") ++ show (a, b)



sort4 :: (Int, Int, Int, Int) -> Out (Int, Int, Int, Int)
sort4 (a, b, c, d) = do (e, f) <- comp 1 (a, b)    -- unit 1
                        (g, h) <- comp 2 (c, d)    -- unit 2
                        (n, i) <- comp 3 (e, g)    -- unit 3
                        (j, k) <- comp 4 (f, h)    -- unit 4
                        (m, l) <- comp 5 (i, j)    -- unit 5
                        return (k, l, m, n)

Here is a sample run:

Main> sort4 (23, 12, -1, 2)
(-1,2,12,23)
Unit 1: pass: (23,12)
Unit 2: swap: (-1,2)
Unit 3: pass: (23,2)
Unit 4: pass: (12,-1)
Unit 5: swap: (2,12)



What happens if we want to observe the output in some different order? For instance, 
we might want to see the output of the third unit after the fifth. Intuitively, it must be 
sufficient to move the third line after the fifth in the definition of sort4, obtaining: 

sort4 (a, b, c, d) = do (e, f) <- comp 1 (a, b)    -- unit 1
                        (g, h) <- comp 2 (c, d)    -- unit 2
                        (j, k) <- comp 4 (f, h)    -- unit 4
                        (m, l) <- comp 5 (i, j)    -- unit 5
                        (n, i) <- comp 3 (e, g)    -- unit 3
                        return (k, l, m, n)

Alas, this modification is illegal: The variable i is unbound when used in the fourth line.
Luckily, value recursion fits the bill. All we need to say is that the variable i used by the
5th unit is the one that is defined by the 3rd, which can be handled by an mdo-expression.
As we have seen in Section 4.5, the corresponding mfix is given by:

instance MonadFix Out where
  mfix f = let Out (a, s) = f a in Out (a, s)

With this declaration and the use of the keyword mdo, sort4 will work as expected, 
delivering the output of the third unit after that of the fifth. 
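
To try the reordered network with a current GHC, a few additions are needed that postdate the code above: the `RecursiveDo` extension, and `Functor` and `Applicative` instances, which are now superclasses of `Monad`. The sketch below is a self-contained version under these assumptions. The instances for `Functor` and `Applicative` are derived mechanically from the `Monad` instance and are not part of the original text.

```haskell
{-# LANGUAGE RecursiveDo #-}
import Control.Monad (ap, liftM)
import Control.Monad.Fix (MonadFix (..))

newtype Out a = Out (a, String)

-- Boilerplate required by current GHC (assumption: not in the original).
instance Functor Out where
  fmap = liftM
instance Applicative Out where
  pure x = Out (x, "")
  (<*>)  = ap

instance Monad Out where
  Out ~(x, s) >>= f = let Out (y, s') = f x in Out (y, s ++ s')

instance MonadFix Out where
  mfix f = let Out (a, s) = f a in Out (a, s)

instance Show a => Show (Out a) where
  show (Out (v, s)) = show v ++ s

comp :: Int -> (Int, Int) -> Out (Int, Int)
comp i (a, b) = Out ((a `max` b, a `min` b), "\nUnit " ++ show i ++ msg)
  where msg = (if a < b then ": swap: " else ": pass: ") ++ show (a, b)

-- Unit 3 is listed after unit 5, so i is used before it is bound.
sort4 :: (Int, Int, Int, Int) -> Out (Int, Int, Int, Int)
sort4 (a, b, c, d) = mdo
  (e, f) <- comp 1 (a, b)  -- unit 1
  (g, h) <- comp 2 (c, d)  -- unit 2
  (j, k) <- comp 4 (f, h)  -- unit 4
  (m, l) <- comp 5 (i, j)  -- unit 5
  (n, i) <- comp 3 (e, g)  -- unit 3
  return (k, l, m, n)
```

For the input (23, 12, -1, 2), the sorted result is unchanged, while the transcript now lists the units in the order 1, 2, 4, 5, 3. Note that the lazy pattern in the definition of >>= is essential here: with a strict pattern, the recursive dependency on i could not be resolved.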

A similar phenomenon occurs in GUI based programming, where the order of monadic 
actions implicitly determines the screen layout. To illustrate, consider the following simple 
example, taken from Thiemann's work on a CGI library for Haskell [85]: 

do f1 <- inputField (fieldSize 10)
   f2 <- inputField (fieldSize 10)
   submitButton (someAction f1 f2)

The corresponding GUI will have two input fields side by side, followed by a submit 
button. What happens if we want to place the submit button to the left of the input 
fields? Since the ordering of the statements in the do-expression determines the position 
of the GUI elements, we would like to move the call to submitButton to the first line, 
textually preceding the calls to inputField. As Thiemann also points out, such a move
would require the use of an mdo-expression, since the variables f1 and f2 will no longer be
visible when used as arguments to someAction.

9.3 Interpreters 

Suppose you are designing an interpreter for a language that has let-bindings for intro- 
ducing local bindings. Operationally, the expression let v = e in b denotes the same 
expression as b, where e is substituted for all free occurrences of the variable v. The 
abstract syntax of your language might include: 

data Exp = ... \ Let Var Exp Exp 

Assuming the language is applicative, the natural choice for implementation would be 
the environment monad (Section 14.61 ). In this setting, the section of the interpreter that 
handles the let-expressions might look like: 

eval (Let v e b) = do ev <- eval e
                      inExtendedEnv (v, ev) (eval b)






where inExtendedEnv simply extends the environment with the binding v ↦ ev before
passing it on. This approach yields a satisfactory implementation. 

Note that our eval function cannot deal with recursive bindings, i.e., in the expression
Let v e b, v is not visible in e. What happens if we lift this restriction? All we need
is a way to extend the environment with the binding v ↦ ev in the call to eval e,
before we actually know what ev is. The following mdo-expression expresses the required 
dependency: 

eval (Let v e b) = mdo ev <- inExtendedEnv (v, ev) (eval e)
                       inExtendedEnv (v, ev) (eval b)

In contrast, consider how we might solve this problem without using value recursion. 
Assuming Val denotes the data type for the values our language can process, and the
following declaration of environments: 

data Env a = Env ([(Var, Val)] -> a)

we are forced to implement recursive let-expressions as follows: 

eval (Let v e b) = Env (\env -> let Env f = eval e
                                    ev    = f ((v, ev) : env)
                                    Env g = eval b
                                in g ((v, ev) : env))

Although it will perform the required task, this solution is hardly satisfactory. First 
of all, we had to reveal how environments are actually implemented, defeating the whole 
point of the monadic abstraction. As a result, our code will only work with that partic- 
ular implementation; switching to a different representation will require changes in the 
interpreter. The code is no longer easy to understand or maintain. 

On the other hand, our first implementation using the mdo-notation is quite simple to 
understand, concise, and not tied to any particular representation of environments. 
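
To see the mdo-based clause at work, here is a small self-contained sketch of such an interpreter, assuming GHC's RecursiveDo extension. Only the Let clause and inExtendedEnv come from the discussion above; the remaining constructors of Exp, the Val type, and the helpers lookupVar, closure, apply and arith are hypothetical additions chosen so that a recursive binding can actually be exercised:

```haskell
{-# LANGUAGE RecursiveDo #-}
import Control.Monad.Fix (MonadFix (..))

type Var = String

data Val = IntV Integer | FunV (Val -> Val)

-- Assumed object language; only Let is taken from the text.
data Exp = Lit Integer | Ref Var | Lam Var Exp | App Exp Exp
         | IfZero Exp Exp Exp | Mul Exp Exp | Sub Exp Exp
         | Let Var Exp Exp

-- Environment monad: a computation that reads a list of bindings.
newtype Env a = Env { runEnv :: [(Var, Val)] -> a }

instance Functor Env where
  fmap f (Env g) = Env (f . g)

instance Applicative Env where
  pure a = Env (const a)
  Env f <*> Env g = Env (\env -> f env (g env))

instance Monad Env where
  Env g >>= f = Env (\env -> runEnv (f (g env)) env)

instance MonadFix Env where
  mfix f = Env (\env -> let a = runEnv (f a) env in a)

inExtendedEnv :: (Var, Val) -> Env a -> Env a
inExtendedEnv binding (Env g) = Env (\env -> g (binding : env))

lookupVar :: Var -> Env Val
lookupVar v = Env (\env -> maybe (error ("unbound variable: " ++ v)) id (lookup v env))

-- Capture the current environment in a function value.
closure :: Var -> Env Val -> Env Val
closure v body = Env (\env -> FunV (\x -> runEnv body ((v, x) : env)))

apply :: Val -> Val -> Val
apply (FunV g) x = g x
apply _ _ = error "not a function"

arith :: (Integer -> Integer -> Integer) -> Val -> Val -> Val
arith op (IntV a) (IntV b) = IntV (op a b)
arith _ _ _ = error "not a number"

eval :: Exp -> Env Val
eval (Lit n) = return (IntV n)
eval (Ref v) = lookupVar v
eval (Lam v b) = closure v (eval b)
eval (App f a) = apply <$> eval f <*> eval a
eval (Mul a b) = arith (*) <$> eval a <*> eval b
eval (Sub a b) = arith (-) <$> eval a <*> eval b
eval (IfZero c t e) = do
  cv <- eval c
  tv <- eval t   -- both branches are lazy thunks; only one is forced
  ev <- eval e
  return (case cv of IntV 0 -> tv; _ -> ev)
eval (Let v e b) = mdo
  ev <- inExtendedEnv (v, ev) (eval e)
  inExtendedEnv (v, ev) (eval b)

-- let fact = \n -> if n == 0 then 1 else n * fact (n - 1) in fact 5
factorialOfFive :: Exp
factorialOfFive =
  Let "fact"
    (Lam "n" (IfZero (Ref "n") (Lit 1)
       (Mul (Ref "n") (App (Ref "fact") (Sub (Ref "n") (Lit 1))))))
    (App (Ref "fact") (Lit 5))

result :: Integer
result = case runEnv (eval factorialOfFive) [] of
  IntV n -> n
  FunV _ -> error "expected a number"
```

Note that eval itself never inspects the representation of Env in the Let clause; only the primitive operations do.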

9.4 Doubly linked circular lists with mutable nodes 

Consider a simple implementation of doubly linked circular lists in Haskell. For this 
example, we will store a mutable boolean flag at each node, a True value indicating that 
the node is already visited in a particular traversal. We use the internal state monad to 
gain access to mutable variables [52]. The nodes in our circular lists have the following
structure: 



newtype Node s a = N (STRef s Bool, Node s a, a, Node s a) 

consisting of the mutable flag, the pointer to the previous node, the data item, and the
pointer to the next node. Given two nodes b and f, a new node in between is created by
the following function:

newNode :: Node s a -> a -> Node s a -> ST s (Node s a)
newNode b c f = do v <- newSTRef False
                   return (N (v, b, c, f))



Here is a simple example of a circular list, and its rendering in Haskell using the
function newNode. Note that the use of the mdo-expression is essential in expressing the
cyclic structure:⁴

l1 :: ST s (Node s Int)
l1 = mdo n0 <- newNode n3 0 n1
         n1 <- newNode n0 1 n2
         n2 <- newNode n1 2 n3
         n3 <- newNode n2 3 n0
         return n0



Traversing a given doubly linked list simply amounts to following the links until we 
reach a node that has been visited before: 




data Direction = Forward | Backward deriving Eq

traverse :: Direction -> Node s a -> ST s [a]
traverse dir (N (v, b, i, f)) =
    do visited <- readSTRef v
       if visited
          then return []
          else do writeSTRef v True
                  let n = if dir == Forward then f else b
                  is <- traverse dir n
                  return (i:is)

Here's a sample run: 



4 A more traditional technique would rely on creating dummy initial link values for at least one of the 
nodes, and explicitly overwriting them when the rest of the structure is created. This "clunky" approach 
is often seen in the formation of cyclic objects in imperative languages, such as Java. Perhaps an mdo-like 
construct could help there also. 

Main> runST (l1 >>= traverse Forward)
[0,1,2,3]

Main> runST (l1 >>= traverse Backward)
[0,3,2,1]
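
For readers who want to reproduce this run with a current compiler, the following is a self-contained sketch, assuming GHC's RecursiveDo extension and the strict ST monad from base. The traversal is named walk here purely to avoid a clash with the Prelude's traverse, and the names forwards and backwards are illustrative:

```haskell
{-# LANGUAGE RecursiveDo #-}
import Control.Monad.Fix ()  -- instances only; mdo desugars to mfix
import Control.Monad.ST (ST, runST)
import Data.STRef (STRef, newSTRef, readSTRef, writeSTRef)

newtype Node s a = N (STRef s Bool, Node s a, a, Node s a)

newNode :: Node s a -> a -> Node s a -> ST s (Node s a)
newNode b c f = do
  v <- newSTRef False
  return (N (v, b, c, f))

-- Four nodes 0..3 linked in a cycle; mdo lets n0 mention n3 and n1
-- before they are created.
l1 :: ST s (Node s Int)
l1 = mdo
  n0 <- newNode n3 0 n1
  n1 <- newNode n0 1 n2
  n2 <- newNode n1 2 n3
  n3 <- newNode n2 3 n0
  return n0

data Direction = Forward | Backward deriving Eq

-- Follow the links until a node whose flag is already set is reached.
walk :: Direction -> Node s a -> ST s [a]
walk dir (N (v, b, i, f)) = do
  visited <- readSTRef v
  if visited
    then return []
    else do
      writeSTRef v True
      let n = if dir == Forward then f else b
      is <- walk dir n
      return (i : is)

forwards, backwards :: [Int]
forwards = runST (l1 >>= walk Forward)
backwards = runST (l1 >>= walk Backward)
```

Each call to runST builds a fresh list, so the flags set by one traversal do not affect the other.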

The inverse function that takes a non-empty list and constructs a doubly linked circular 
list out of its elements further illustrates the use of value recursion: 

encircle :: [a] -> ST s (Node s a)
encircle (x:xs) = mdo c <- newNode l x f
                      (f, l) <- encircle' c xs
                      return c

encircle' :: Node s a -> [a] -> ST s (Node s a, Node s a)
encircle' p [] = return (p, p)
encircle' p (x:xs) = mdo c <- newNode p x f
                         (f, l) <- encircle' c xs
                         return (c, l)

We have: 

Main> runST (encircle "hello world" >>= traverse Backward)
"hdlrow olle"

Main> runST (encircle "hello world" >>= traverse Forward)
"hello world"

Similar techniques might be useful in the functional implementation of graph algorithms
as well [45]. In general, programs manipulating stateful objects with cyclic dependencies
can benefit from value recursion. For instance, Nordlander shows how to use
value recursion to express layered networking protocols in the context of his O'Haskell
language [65, Section 4.2].

9.5 Logical variables 

In a tutorial paper on monads and effects, Benton, Hughes and Moggi suggest the following 
exercise on programming with monads [4, Exercise 55]: 

Prolog provides so-called logical variables, whose values can be referred to 
before they are set. Define a type LVar and a monad Logic in terms of ST, 
supporting operations: 
newLVar   :: Logic s (LVar s a)
readLVar  :: LVar s a -> a
writeLVar :: LVar s a -> a -> Logic s ()



where s is again a state-thread identifier. The intention is that an LVar should 
be written exactly once, but its value may be read beforehand, between its
creation and the write — lazy evaluation is at work here. Note that readLVar 
does not have a monadic type, and so can be used anywhere. 

Clearly, we will need to use value recursion in implementing newLVar, allowing us to 
access the value of a logical variable before it is actually set. There is a small problem, 
however. How do we determine the scope of a logical variable, i.e., how do we make it 
available to the rest of the computation? We solve this problem by using the continuation 
monad transformer, a clever trick suggested to us by John Hughes.⁵ Using this idea, the
Logic monad looks like: 



data Logic s a = Logic {unL :: forall r. (a -> ST s r) -> ST s r}

instance Monad (Logic s) where
    return a      = Logic (\k -> k a)
    Logic f >>= g = Logic (\k -> f (\a -> unL (g a) k))

A logical variable is nothing but a value and a pointer to it. To read, we simply project
the value. To write, we update the mutable cell:

newtype LVar s a = LVar (STRef s a, a)

readLVar :: LVar s a -> a
readLVar (LVar (_, v)) = v

writeLVar :: LVar s a -> a -> Logic s ()
writeLVar (LVar (r, _)) a = Logic (\k -> writeSTRef r a >>= k)




The magic that makes logical variables work is hidden in newLVar: 



newLVar :: Logic s (LVar s a)
newLVar = Logic (\k -> mdo r <- newSTRef (error "unbound LVar!")
                           a <- k (LVar (r, v))
                           v <- readSTRef r
                           return a)



5 An alternative would be to use the type newLVar :: (LVar s a -> Logic s b) -> Logic s b, requiring
the user to explicitly specify the scope, as in: newLVar (\v -> ...). In that case, the ST monad itself would
serve as the Logic monad, without any need for continuations.

Here is how newLVar works. We allocate a new mutable variable, r, and form the pair
(r, v), where v is the value that will eventually be stored in r. This pair is passed to 
the continuation k, representing the remainder of the computation, i.e., the scope of the 
new logical variable. Before returning the result of this call, we simply read the mutable 
variable r, determining the actual value of v. (Note that the computation represented by k 
is expected to call writeLVar on the newly created logical variable, setting its final value.) 
The mdo-expression implicitly uses the function fixST, the value recursion operator for 
Haskell's internal state monad (see Section 4.4). The final bit of machinery we need is a
simple run method to extract values: 

runLogic :: (forall s. Logic s r) — > r 
runLogic f = runST (unL f return) 

Here are some simple examples demonstrating the use of LVars:

t1 = do v <- newLVar
        let val = readLVar v
        return val

t2 = do v <- newLVar
        let val = [0, 6 .. readLVar v]
        writeLVar v 42
        return val

t3 = do v <- newLVar
        let v1 = readLVar v
        writeLVar v 43
        let v2 = readLVar v
        writeLVar v 42
        let v3 = readLVar v
        return (v1, v2, v3)

t4 = do s <- newLVar
        c <- newLVar
        let sVal = readLVar s
            cVal = readLVar c
        writeLVar s "test"
        writeLVar c '1'
        return (cVal : sVal)

We have:

Main> runLogic t1 :: Int
Program error: unbound LVar!
Main> runLogic t2
[0,6,12,18,24,30,36,42]
Main> runLogic t3
(42,42,42)
Main> runLogic t4
"1test"



In t1, we never write to v, hence its value is left undefined. All calls to writeLVar
except the last will be ignored,⁶ as demonstrated by t3. Finally, t4 shows that we can use
variables with different types in the same computation.
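
The pieces of this section can be assembled into one module that compiles with a current GHC. This is a sketch under a few assumptions: the RankNTypes and RecursiveDo extensions are enabled, Logic is declared as a newtype, the Functor and Applicative instances (which present-day GHC requires of every Monad) are derived from the monadic operations, and writeLVar is taken to write the cell and then invoke the continuation:

```haskell
{-# LANGUAGE RankNTypes, RecursiveDo #-}
import Control.Monad (ap, liftM)
import Control.Monad.Fix ()  -- instances only; mdo desugars to mfix
import Control.Monad.ST (ST, runST)
import Data.STRef (STRef, newSTRef, readSTRef, writeSTRef)

newtype Logic s a = Logic { unL :: forall r. (a -> ST s r) -> ST s r }

instance Functor (Logic s) where
  fmap = liftM

instance Applicative (Logic s) where
  pure a = Logic (\k -> k a)
  (<*>) = ap

instance Monad (Logic s) where
  Logic f >>= g = Logic (\k -> f (\a -> unL (g a) k))

-- A logical variable: the mutable cell together with its eventual value.
newtype LVar s a = LVar (STRef s a, a)

readLVar :: LVar s a -> a
readLVar (LVar (_, v)) = v

writeLVar :: LVar s a -> a -> Logic s ()
writeLVar (LVar (r, _)) a = Logic (\k -> writeSTRef r a >>= k)

-- v is read from the cell only after the continuation k has run,
-- yet it is handed to k lazily beforehand.
newLVar :: Logic s (LVar s a)
newLVar = Logic (\k -> mdo
  r <- newSTRef (error "unbound LVar!")
  a <- k (LVar (r, v))
  v <- readSTRef r
  return a)

runLogic :: (forall s. Logic s r) -> r
runLogic f = runST (unL f return)

t2 :: Logic s [Int]
t2 = do
  v <- newLVar
  let val = [0, 6 .. readLVar v]
  writeLVar v 42
  return val

t3 :: Logic s (Int, Int, Int)
t3 = do
  v <- newLVar
  let v1 = readLVar v
  writeLVar v 43
  let v2 = readLVar v
  writeLVar v 42
  let v3 = readLVar v
  return (v1, v2, v3)
```

The sketch relies on the continuation never forcing the value of a logical variable while it is still running; forcing it early would demand the result of the enclosing mfix before it is available.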

Claessen and Ljunglöf show how one can use logical variables to embed a typed functional
logic programming language in Haskell [13]. Similar to our implementation, they



6 Of course, every call to writeLVar will be performed when the computation is run, in the given order. 
However, all calls to readLVar will return the last value written, regardless of their order. For all practical 
purposes, logical variables behave as constants, whose values can be used before they are set. 

use the ST monad to get access to typed mutable variables. However, they only allow
access to logical variables inside their version of the Logic monad. (In our terms, their readLVar
function has the type LVar s a -> Logic s a.) It might be interesting to combine
their work with ours, allowing logical variables to be used anywhere, and hence providing 
a more flexible embedding. We leave the exploration of this idea for future work. 

9.6 Summary 

In this chapter, we illustrated the use of value recursion operators and of the recursive 
do-notation. We admit that some of our examples might seem a little contrived, with 
the notable exception of the circuit modeling example of Section 1.2. However, it is our
hope that readers will be able to relate these examples to their own work, spotting further 
applications for value recursion. Here are some common cases to watch for: 

• Programs dealing with data flow equations. Assuming the underlying model is 
monadic, any feedback loop or a cyclic dependency would signal the need for a 
value recursion operator to tie the recursive knot. Our circuit modeling example is 
an instance of this problem. 

• Stateful objects with mutual dependencies. Again, if a monadic interface is used, 
mutual dependencies will require the use of value recursion. Our implementation 
of doubly linked circular lists, and the network programming example in O'Haskell 
(see Section 10.1 for a brief discussion) are examples of this kind.

• Programs that combine several phases and use recursion to eliminate multiple traversals
of data structures, similar to the repmin problem of Section 9.1. If any one of
the eliminated phases requires monadic effects, value recursion becomes the tool for
expressing the required cyclic dependence.

• Monadic programs where a particular ordering of effects forces us to use variables 
that will only become available later, similar to the sorting networks or GUI design 
examples of Section 9.2.



Chapter 10 
Epilogue 



In this thesis, we have studied the interaction between two fundamental notions in pro- 
gramming languages: Recursion and effects. As we have seen, cyclic definitions in the 
presence of monadic effects can be understood in terms of value recursion operators, whose 
behavior can be characterized by means of a number of equational properties. It is our 
belief that these properties capture the essence of the interaction satisfactorily. Of course, 
the extent to which our axiomatization is successful will only be determined by practice. 
Our properties could be deemed appropriate if they rule out useless definitions of value 
recursion operators, and admit only those that are meaningful in practical programs. It is 
still too early to come to a decisive conclusion in this regard, but we hope that our work 
will be useful for both researchers and practitioners, especially as monads become more 
and more pervasive in functional programming. 

We conclude our exposition of value recursion by briefly reviewing the related work, 
and pointing out some future research opportunities. 

10.1 Related work 

The interaction between recursion and shared computations has been extensively studied
by Hasegawa [32, 33]. Sharing is a commutative effect, i.e., the order of computations does
not matter.¹ As we have explored in the first part of Chapter 6, recursion in commutative
monads can be understood in terms of traces in symmetric monoidal categories. Hasegawa
shows that giving a trace over a cartesian closed category is the same as giving a fixed-point
operator for it (see Theorem 6.2.4). This result is remarkable, as it provides an escape
from the usual domain-theoretic view, increasing the level of abstraction considerably. As
Hasegawa himself points out, however, when the underlying effect is non-commutative, we
can no longer stay in the monoidal world.

1 Think of a recursive let-expression in Haskell. The order of bindings is irrelevant; equations can be
swapped around without changing the result. 

Paterson introduced loop operators for handling value recursion in arrows [36, 66].
Although Paterson notes that some of the axioms are too strong for many practical cases, 
he does not weaken his axiomatization to accommodate accordingly. On the syntactic 
side, Paterson's work introduced a convenient notation for programming with arrows in 
Haskell, providing support for recursive bindings as well. However, rather than letting the 
translation figure out recursive segments as we do in the mdo-notation, Paterson prefers 
using an explicit keyword, rec, asking the programmers to mark recursive blocks explicitly. 
(The rec keyword is modeled after O'Haskell's handling of recursive bindings, reviewed 
below.) 
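
As a concrete point of comparison, the loop operator is available in GHC's base library as a method of the ArrowLoop class in Control.Arrow. The sketch below uses two instances that ship with base, plain functions and Kleisli arrows over a monad with a value recursion operator; the names firstOnes and firstOnesM are illustrative. In both cases the second component of the result is fed back as the second component of the input:

```haskell
import Control.Arrow (Kleisli (..), loop)

-- Function arrows: the fed-back component is an infinite list of ones,
-- defined in terms of itself.
firstOnes :: Int -> [Int]
firstOnes = loop (\(n, ones) -> (take n ones, 1 : ones))

-- Kleisli arrows over Maybe: the feedback goes through Maybe's mfix.
firstOnesM :: Int -> Maybe [Int]
firstOnesM = runKleisli (loop (Kleisli (\(n, ones) -> Just (take n ones, 1 : ones))))
```

The Kleisli instance is where the connection to value recursion shows up: it is only available for monads that are instances of MonadFix.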

Building on our initial paper on monadic fixed-points [18], Benton and Hyland take
Hasegawa's work one step further by generalizing the notion of trace to premonoidal
categories [5]. (It turns out that Benton and Hyland's axiomatization and Paterson's
work on arrows are essentially the same, although developed independently and presented
in slightly different contexts.) Similar to Paterson's loop axioms, Benton and Hyland's
axiomatization is too strong for many monads as well. As we have seen in the second half
of Chapter 6, their sliding and right tightening laws are simply not satisfiable in many
practical cases (see also Section 3.1). As a consequence, their work can explain value
recursion for the state monad (and those monads that embed into it, such as output and
environments), but not exceptions, lists, or the I/O monad. In general, any monad that
is based on a sum-like data type will fail to satisfy their requirements. In any case, we
consider the work of Paterson and of Benton and Hyland an important step toward a
categorical account of value recursion.

Friedman and Sabry [25] approach value recursion from an entirely different angle.
Rather than considering individual monads separately, they consider recursion itself as a
computational effect, following an operational definition: Allocate a reference cell, evaluate
the body, and update the cell with the result. (This process is essentially how Scheme
models recursion, as we have briefly covered in Section 5.3.) Since recursion is performed
in the combined monad, it is the users' responsibility to translate original problems and 
values to and from this combined world. That is, to model value recursion in a monad m, 
they end up using a function: 

mfixM :: (STM s m a -> STM s m a) -> STM s m a

where STM is the state monad transformer.² Furthermore, all the morphisms of the
base monad have to be lifted into this "state enriched" world as well, and this is where 



2 Note that mfixM accepts a function from computations to computations, rather than from values to
computations as in the case for mfix. This change of view is necessary for implementing the allocate- 
evaluate-overwrite model. 

the interaction between particular effects and recursion has to be addressed by the user.
Unlike us, however, they do not postulate any properties, hence it is up to the user to 
come up with correct liftings. As Friedman and Sabry observe themselves, their method 
is rather inconvenient to use from a programming perspective, compared to our mdo- 
expressions and direct handling of recursion in the given monad. Unfortunately, a similar 
comparison is not immediately possible from a theoretical point of view, as the approaches 
are fundamentally different. 
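
The allocate, evaluate, update recipe itself is easy to illustrate. The sketch below is not Friedman and Sabry's mfixM: it is a simplified, hypothetical helper (fixViaCell) that works directly in the IO monad and keeps the values-to-computations shape of mfix, using unsafeInterleaveIO from base to defer the read of the cell until the value is demanded:

```haskell
import Data.IORef (newIORef, readIORef, writeIORef)
import System.IO.Unsafe (unsafeInterleaveIO)

-- Hypothetical helper: value recursion in IO through a reference cell.
fixViaCell :: (a -> IO a) -> IO a
fixViaCell body = do
  -- Allocate: the cell initially holds an error thunk.
  cell <- newIORef (error "fixViaCell: value demanded too early")
  -- The read is deferred until the value is actually forced.
  future <- unsafeInterleaveIO (readIORef cell)
  -- Evaluate: run the body with the not-yet-available value.
  result <- body future
  -- Update: overwrite the cell with the final result.
  writeIORef cell result
  return result

-- A cyclic list built through the cell.
ones :: IO [Int]
ones = fixViaCell (\xs -> return (1 : xs))
```

If the body forces its argument before returning, the error thunk placed in the cell is what gets evaluated, which is the operational counterpart of a diverging fixed point.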

From a practical point of view, much greater similarity to our work is found in Nord- 
lander's O'Haskell language. O'Haskell is an object oriented extension of Haskell, designed 
for addressing issues in reactive functional programming [65]. One application of O'Haskell
is in programming layered network protocols. Each layer interacts with its predecessor 
and successor by receiving and passing information in both directions. In order to connect 
two protocols that have mutual dependencies, one needs a recursive knot-tying opera- 
tion. Since O'Haskell objects are monadic, value recursion is employed in establishing 
such connections. O'Haskell adds a keyword fix to the do-notation, whose translation is 
a simplified version of our mdo-notation. The O'Haskell work, however, does not try to 
axiomatize or generalize the idea any further. 

Carlsson and Hallgren discuss a variety of loop operators in the context of their work 
on stream based programming using fudgets [ 129] . Although the intended semantics of 
their loop operators is quite similar to those of value recursion operators, the types and 
the mechanics are somewhat different. For instance, one of their operators has the type: 

loopLeftF :: F (Either α β) (Either α γ) -> F β γ

which, intuitively, ties the recursive loop over α, resulting in a fudget from β to γ. Carlsson
and Hallgren use loop operators only in the framework of fudgets, without generalizing to 
arbitrary monads, or studying their behavior more abstractly. 

The circuit modeling example we have seen in Section 1.2 is discussed in detail in 
Claessen's recent dissertation [12]. Although Claessen points out the need for an appro- 
priate looping combinator, he does not pursue the monadic approach any further. Instead, 
he introduces the notion of observable sharing, which is a non-conservative³ extension to
Haskell [14]. (Briefly, observable sharing allows programmers to determine whether a circuit
component is reached via a feedback loop, solving the infinite unfolding problem.)
Claessen argues that "...loop combinators are unfortunate because they introduce extra
clutter in the code that is hard to motivate" [12]. We believe that our mdo-notation
addresses Claessen's concerns perfectly, relieving the programmers from error-prone and 



3 Since the addition of observational sharing violates referential transparency, the resulting language is
no longer pure. That is, the law: let x = M in N = N[fix (λx. M)/x] no longer holds.

cumbersome uses of explicit looping combinators. In addition, the monadic approach has
the obvious advantage of keeping the underlying language pure, providing a nice and clean 
semantic framework. 

Turbak and Wells introduce the cycamore data type, which is aimed at simplifying 
the use of cyclic structures in declarative languages [86]. The basic idea is to associate 
each node in a cycamore with a global unique identifier, similar to our doubly linked list 
example of Section 9.4. They consider implementations in both ML and Haskell, and the
Haskell version makes use of references in the state monad to implement unique identifiers. 
As expected, Turbak and Wells employ value recursion in order to express the required 
cyclic structure. 



10.2 Future Work 



Although we have concentrated on applications in functional programming, value recursion 
certainly makes sense in other programming paradigms as well. One future research direc- 
tion to explore is the problem of creating cyclic structures in imperative languages. Such 
structures arise quite frequently in practice. For instance, the following example presents 
an opportunity in IBM's data manipulation language for its DB2 database system [IlljJ 4 ] 

create type VDept_t as
    (name Varchar(20)) mode db2sql;

create type VPerson_t as
    (name Varchar(40)) mode db2sql;

create type VEmp_t under VPerson_t as
    (dept Ref(VDept_t)) mode db2sql;

alter type VDept_t
    add attribute mgr Ref(VEmp_t);

[Figure: diagram of the types VDept_t, VPerson_t, and VEmp_t, with the mgr and dept references.]



In this example, the user creates three types: department, person, and employee. Each 
department has a name and a manager. Each person is identified by a name. Finally, each 
employee is a VPerson_t, which further has a (reference to a) particular department. While 
the create type directives for VPerson_t and VEmp_t reflect the structure correctly, the 
VDept_t type cannot be created with both of its required attributes. Clearly, the difficulty 
arises as the VEmp_t type is not yet visible when VDept_t is created. The final alter type 
directive remedies the situation in a roundabout fashion, adding the missing attribute. 



4 This example was pointed out to us by David Maier. 

We see two opportunities with regard to our research. First of all, better syntactic
support (along the lines of our mdo-notation) would help get rid of the final alter type 
directive, keeping the declaration of VDept_t self-contained, possibly simplifying further 
analyses. More importantly, if such declarations are ever given a monadic semantics, value 
recursion would be the right tool for modeling the cyclic dependency. Similar opportunities 
exist in other languages as well. 

On the theoretical side, we would like to see value recursion studied in a more abstract
setting. In this regard, the trace-fixed point correspondence, as we have studied in Chapter 6,
seems to be the right direction to proceed. We would like to investigate the reasons
why the axiomatization via traces turns out to be too strong, hopefully augmenting the
theory to capture the practical aspects more precisely. For instance, it would be interesting
to pin down the role of the right shrinking property precisely. As we have seen in
Chapter 3, the right shrinking property is not satisfiable whenever the >>= operator is strict
in its first argument, and hence a weakening of the trace-based axiomatizations seems
inevitable.

Several questions remain to be explored regarding the behavior of value recursion operators.
For instance, we lack a reasoning principle along the lines of fixed-point induction.
Recall that the fixed-point induction principle states that $P\,(\mathit{fix}\ f)$ can be established by
showing that $P\,\bot \wedge \forall d.\,(P\,d \Rightarrow P\,(f\,d))$ holds, provided $P$ is an admissible predicate.
(The obvious generalization: $P\,\bot \wedge \forall d.\,(P\,d \Rightarrow P\,(d \mathbin{>\!\!>\!\!=} f)) \Rightarrow P\,(\mathit{mfix}\ f)$ is not sound,
as it implicitly assumes an unfolding view of value recursion.) It is probably the case that
one needs to formulate and prove a separate induction principle for each new mfix, rather
than looking for a universal principle that would work for all cases. While our properties
provide a framework for reasoning about terms involving mfix, such an induction principle
might prove essential for reasoning about value recursion in general.

Another question is the automatic construction of value recursion operators for ar- 
bitrary monads. Although we have seen many "design patterns," it is still not clear 
how to define an appropriate operator for a given monad that will satisfy our properties. 
(The continuation monad seems to be the problem child in this regard.) Although it is 
highly unlikely that a magic recipe for automatic construction of such operators exists, it 
would be nice to pin down the exact conditions under which their existence (and possibly 
uniqueness) can be guaranteed. 

The semantics we have presented in Chapter 8 for modeling monadic I/O needs some
improvements to simplify reasoning with symbolic terms. Furthermore, we would like 
to extend our language to support more features, such as concurrency and exceptions. 
While concurrency seems relatively easy to support, it is not immediately clear how to 
extend our system to include Haskell'98 style exceptions. More importantly, it would be 
interesting to show that the addition of monadic I/O primitives, mutable variables, and
support for value recursion preserves the parametricity principle. Also, we would like to 
design an accompanying abstract machine semantics, which might be useful as a basis for 
constructing interpreters for similar languages. 

Whether the mdo-notation should eventually replace the current do-notation in Haskell 
is a question that will have to be answered by the Haskell community. While we believe 
that a single construct should handle both recursive and non-recursive cases, such a change 
potentially breaks existing programs, and it might be a better idea to make the switch in 
a future version of Haskell. 



Appendix A 
Fixed-point operators 



In this appendix, we briefly review fixed-point operators. Our aim is to introduce the 
terminology we use, providing pointers to the literature for details as necessary. 

In the domain theoretic semantics of programming languages, types are modeled by
domains and functions are modeled by continuous maps. The meaning of a typical recursive
declaration of the form let x = M in N is taken to be N [fix (λx. M)/x], where

$$\mathit{fix} :: (\alpha \to \alpha) \to \alpha, \qquad \mathit{fix}\ f = \bigsqcup_{i \ge 0} f^{i}\,\bot$$

assuming M has type α. Note that x need not be a function only; we might define recursive
values this way as well. For instance, we have (using Haskell-like notation):

let ones = 1 : ones in ones = fix (λones. 1 : ones)

The least fixed-point theorem states that fix f is the least fixed-point of f [76, 92].
That is, (i) it satisfies the fixed-point property: f (fix f) = fix f, and (ii) it is the least
such value, i.e., for all x s.t. x = f x, we have fix f ⊑ x. We use the name fix only to
mean this particular fixed-point operator over domains.

The theory of fixed points is extensively studied [9], \W[ [81]. It is neither possible, nor
necessary for us to summarize this huge body of work here; we will simply state the results
that are most relevant to our work.

Property A.1 (Dinaturality.) Let f :: α → β, g :: β → α. The dinaturality¹ property
of fix states that:

fix (f · g) = f (fix (g · f))

1 The term dinaturality refers to the fact that fix can be viewed as a dinatural transformation between
certain functors [551 ED]. We will not need this level of detail in our work, so we skip the details.

Property A.2 (Bekić.) Let f :: α × α → α. The Bekić property of fix states that:

fix (λx. fix (λy. f (x, y))) = fix (λx. f (x, x))

Or, equivalently,

fix (λt. fix (λv. f (π₁ t, π₂ v))) = fix f

where f :: α × β → α × β. It is easy to generalize to arbitrary number of variables, rather
than just two; see Winskel's textbook for details [92].²
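
Both properties can be observed on small Haskell examples, using fix from Data.Function in the base library. This is only an illustrative sketch: the functions f, g and h are arbitrary choices made here, and comparing finite prefixes of the resulting lists is a spot check rather than a proof:

```haskell
import Data.Function (fix)

ones :: [Int]
ones = fix (\xs -> 1 : xs)

-- Dinaturality: fix (f . g) = f (fix (g . f))
f :: [Int] -> [Int]
f = (0 :)

g :: [Int] -> [Int]
g = map (+ 1)

dinatLeft, dinatRight :: [Int]
dinatLeft = fix (f . g)
dinatRight = f (fix (g . f))

-- Bekić, diagonal form: fix (\x -> fix (\y -> h (x, y))) = fix (\x -> h (x, x))
h :: ([Int], [Int]) -> [Int]
h (x, y) = 1 : zipWith (+) x y

bekicLeft, bekicRight :: [Int]
bekicLeft = fix (\x -> fix (\y -> h (x, y)))
bekicRight = fix (\x -> h (x, x))
```

In GHCi, take 5 dinatLeft and take 5 dinatRight agree, as do take 5 bekicLeft and take 5 bekicRight.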

In Chapter 6, we consider fixed-point operators in more abstract settings, i.e., without
assuming that the underlying structure is domains and continuous maps. We assume a
minimal acquaintance with category theory in the following discussion [2, 70]. The basic
structure we work with is a category C with finite products. We write 1 for the terminal 
object. The set of arrows between two objects A and B is denoted C(A, B). We will need 
the following basic definitions [331 ES] : 

Definition A.3 A fixed-point operator is a family of functions $(-)^{*}_{A} : C(A, A) \to C(1, A)$,
such that for any $f : A \to A$, $f \cdot f^{*} = f^{*}$.

Definition A.4 A parameterized fixed-point operator is a family of functions:

$$(-)^{\dagger}_{A,X} : C(A \times X, X) \to C(A, X)$$

satisfying:

• Parameterized fixed-point property: For $f : A \times X \to X$, $f \cdot \langle id_A, f^{\dagger} \rangle = f^{\dagger}$.

• Naturality in A: For $f : A \times X \to X$ and $g : B \to A$, $(f \cdot (g \times id_X))^{\dagger} = f^{\dagger} \cdot g$.

Definition A.5 A Conway operator is a parameterized fixed-point operator that further
satisfies:

• Dinaturality: For $f : A \times X \to Y$ and $g : A \times Y \to X$, $(g \cdot \langle \pi_{A,X}, f \rangle)^{\dagger} = g \cdot \langle id_A, (f \cdot \langle \pi_{A,Y}, g \rangle)^{\dagger} \rangle$.

• Diagonal property: For $f : A \times X \times X \to X$, $(f^{\dagger})^{\dagger} = (f \cdot (id_A \times \langle id_X, id_X \rangle))^{\dagger}$.

The reader need not master these definitions in full; a basic familiarity is sufficient.
For the most part we will be working with fix, and using the dinaturality and Bekic 
properties given before, which are much easier to read and understand. 



2 Bekic's property appears in many different but equivalent forms in the literature [3]. The versions we 
have given here are the ones that are most suitable for our purposes. 

Appendix B
Proofs



In the following proofs, we assume true products. In the case of lifted products, special care
must be taken to ensure that the difference between (⊥, ⊥) and ⊥ is not visible. The cases
when the distinction does matter have been pointed out in the text. (See Warning 2.6.7
as well.)

To save space, we will shorten return to η in our proofs. Also note that we use the name
map to refer to Haskell's fmap, i.e., map :: (α → β) → m α → m β for all monads m,
defined by the equation map f m = m >>= η · f.

B.1 Proposition 2.5.2

Given Equation 2.7, establishing 2.8 is easy. We have:

  mfix (λ(x, _). mfix (λ(_, y). f (x, y)))
= mfix (λt. mfix (λu. f (π₁ t, π₂ u)))
= mfix (λt. mfix (λu. (λ(x, y). f (π₁ x, π₂ y)) (t, u)))
= mfix (λt. (λ(x, y). f (π₁ x, π₂ y)) (t, t))             {Equation 2.7}
= mfix (λt. f (π₁ t, π₂ t))
= mfix f

In the last step, we used the fact that (π₁ t, π₂ t) = t, which only holds for true products.
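
The two shapes of nesting related by this proposition can be spot-checked on a concrete monad. The sketch below uses the MonadFix instance for Maybe from the base library; the functions step and pairStep are arbitrary examples introduced here. Since Haskell's products are lifted, pairStep uses the lazy projections fst and snd rather than pattern matching on its argument:

```haskell
import Control.Monad.Fix (mfix)

-- Diagonal form: nested mfix versus a single mfix over the diagonal.
step :: ([Integer], [Integer]) -> Maybe [Integer]
step (x, y) = Just (1 : zipWith (+) x y)

nested, diagonal :: Maybe [Integer]
nested = mfix (\x -> mfix (\y -> step (x, y)))
diagonal = mfix (\x -> step (x, x))

-- Component form: each mfix ties the knot over one component of a pair.
pairStep :: ([Int], [Int]) -> Maybe ([Int], [Int])
pairStep p = Just (1 : snd p, 2 : fst p)

componentwise, direct :: Maybe ([Int], [Int])
componentwise = mfix (\t -> mfix (\u -> pairStep (fst t, snd u)))
direct = mfix pairStep
```

Comparing finite prefixes, for example fmap (take 5) nested against fmap (take 5) diagonal, shows the two sides of each equation producing the same values on these inputs.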

To show the correspondence in the other direction, let Δ x = (x, x), and note that Δ 
is strict (again thanks to true products). We have: 

mfix (λx. f (x, x)) 
= mfix (f · Δ) 
= map (π₁ · Δ) (mfix (f · Δ))        {π₁ · Δ = id} 
= map π₁ (map Δ (mfix (f · Δ))) 
= map π₁ (mfix (map Δ · f))        {slide} 
= map π₁ (mfix (λx. mfix (λy. (map Δ · f) (π₁ x, π₂ y))))        {Equation 2.8} 
= map π₁ (mfix (λx. mfix (map Δ · (λy. f (π₁ x, y)) · π₂))) 
= map π₁ (mfix (λx. map Δ (mfix (λy. f (π₁ x, y)))))        {slide} 
= map π₁ (mfix (map Δ · (λx. mfix (λy. f (x, y))) · π₁)) 
= (map π₁ · map Δ) (mfix (λx. mfix (λy. f (x, y)))) 
= mfix (λx. mfix (λy. f (x, y))) 

In case of lifted products (Proposition 2.5.4), the proof proceeds similarly. The last 
step in the first proof is not applicable, but in that case we can replace the last line with 
mfix (λ~(x, y). f (x, y)), which is predicted by Equation 2.10. The second implication 
follows similarly. 

B.2 Proposition 2.6.8 

mfix (λ(x, y). f x >>= λz. η (z, h z (x, y))) 
= mfix (λt. (f · π₁) t >>= λz. η (z, h z t)) 
= mfix (λt. (λ(u, v). (f · π₁) u >>= λz. η (z, h z v)) (t, t)) 
= mfix (λx. mfix (λy. (f · π₁) x >>= λz. η (z, h z y)))        {nest} 
= mfix (λx. (f · π₁) x >>= λz. mfix (λy. η (z, h z y)))        {left shrink} 
= mfix (λx. (f · π₁) x >>= λz. η (fix (λy. (z, h z y))))        {purity} 
= mfix (λx. (f · π₁) x >>= λz. η (z, fix (λy. h z (z, y))))        {nest (fix)} 
= mfix f >>= λz. η (z, fix (λy. h z (z, y)))        {pure right} 
= mfix f >>= λz. η (fix (λ(x, y). (z, h z (x, y))))        {nest (fix)} 

B.3 Proposition 2.7.1 

mfix (λ(x, _). f x >>= λy. η (q, y)) 
= mfix (map (λy. (q, y)) · f · π₁) 
= map (λy. (q, y)) (mfix (f · π₁ · (λy. (q, y))))        {strong sliding} 
= map (λy. (q, y)) (mfix (λy. f q)) 
= map (λy. (q, y)) (f q)        {constant functions} 
= f q >>= λy. η (q, y) 



The need for strong sliding is obvious, since otherwise we would have to require f q = 
f ⊥ to satisfy the precedent. 
B.4 Lemma 3.1.4 

Recall that η is a natural transformation, that is, it satisfies the equality map h · η_α = 
η_β · h for all h :: α → β. Assume η is strict at the type α, i.e., η_α ⊥_α = ⊥_{m α}, and >>= is 
strict in its first argument. Pick an arbitrary type β. We will show that η_β ⊥_β = ⊥_{m β}: 

η_β ⊥_β 
= η_β (const ⊥_β ⊥_α) 
= (η_β · const ⊥_β) ⊥_α 
= (map (const ⊥_β) · η_α) ⊥_α        {naturality of η} 
= map (const ⊥_β) (η_α ⊥_α) 
= map (const ⊥_β) ⊥_{m α}        {assumption: η_α ⊥_α = ⊥_{m α}} 
= ⊥_{m α} >>= η · const ⊥_β        {definition of map} 
= ⊥_{m β}        {assumption: >>= is left-strict} 

B.5 Proposition 3.4.2 

Given arbitrary f and g, define: 

h x 1 = f x 
h x 2 = g x 

We have: 

mfix (λx. f x ⊕ g x) 
= mfix (λx. h x 1 ⊕ h x 2) 
= mfix (λx. (η 1 >>= λy. h x y) ⊕ (η 2 >>= λy. h x y)) 
= mfix (λx. (η 1 ⊕ η 2) >>= λy. h x y)        {>>= distributes over ⊕} 
= (η 1 ⊕ η 2) >>= λy. mfix (λx. h x y)        {left shrink} 
= mfix (λx. h x 1) ⊕ mfix (λx. h x 2)        {>>= distributes over ⊕} 
= mfix f ⊕ mfix g 

B.6 Proposition 4.3.1 

We consider each case in turn: 

(4.5) Right to left implication is immediate. From left to right, fix (f · head) must be 
⊥, which only happens when f ⊥ = ⊥. (Note that this establishes the strictness property.) 

(4.6) Similar to the previous case. 

(4.7) Simple case analysis. If mfix f is ⊥, f is strict by 4.5 and both sides reduce 
to ⊥. If mfix f is [ ], then f ⊥ = [ ] by 4.6, reducing both sides to ⊥ again. 
Finally, if mfix f is a cons-cell, the case expression must take its second branch, i.e., 
head (mfix f) = head (fix (f · head)), which is exactly the right hand side by the 
dinaturality of fix. 

(4.8) Similar to the previous case, if mfix f equals ⊥ or [ ], both sides yield ⊥. Otherwise, 
case must take its second branch, i.e., tail (mfix f) = mfix (tail · f). 

(4.9) Consider the test expression for case. We have: 

fix ((λx. f x : g x) · head) = (λx. f x : g x) (fix (head · (λx. f x : g x))) 
                             = (λx. f x : g x) (fix f) 
                             = f (fix f) : g (fix f) 
                             = fix f : g (fix f) 

Hence, the case expression takes its second branch, yielding: 

mfix (λx. f x : g x) = fix f : mfix (tail · (λx. f x : g x)) 
                     = fix f : mfix g 
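
The case for Equation 4.9 can be checked on a small example. The sketch below assumes the list-monad fixed point has the shape suggested by the case analysis above (a case expression over fix (f · head), recursing on tail · f). The name mfixList and the sample functions f and g are hypothetical.

```haskell
import Data.Function (fix)

-- Assumed definition, reconstructed from the case analysis in this proof.
mfixList :: (a -> [a]) -> [a]
mfixList f = case fix (f . head) of
  []      -> []
  (x : _) -> x : mfixList (tail . f)

-- Hypothetical sample functions; both are lazy in their argument.
f :: [Int] -> [Int]
f xs = 1 : xs

g :: [Int] -> [[Int]]
g xs = [2 : xs, 3 : xs]

-- The two sides of mfix (λx. f x : g x) = fix f : mfix g.
lhs, rhs :: [[Int]]
lhs = mfixList (\x -> f x : g x)
rhs = fix f : mfixList g
```

Every element is an infinite list, so the comparison has to go through a finite prefix: map (take 3) lhs and map (take 3) rhs both give [[1,1,1],[2,2,2],[3,3,3]].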

(4.10) We will use the approximation lemma [7, 38], which states that: 

(∀n. approx n xs = approx n ys) ⇒ xs = ys 

for arbitrary lists xs and ys. The function approx is defined as: 

approx :: Integer → [a] → [a] 
approx 0     xs     = ⊥ 
approx (n+1) ⊥      = ⊥ 
approx (n+1) []     = [] 
approx (n+1) (x:xs) = x : approx n xs 
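
The definition above uses n+k patterns, which are not part of Haskell 2010. A minimal runnable sketch with the same behaviour is given below; using undefined to stand for ⊥ and treating negative arguments like 0 are assumptions of this rendering.

```haskell
-- approx n xs agrees with xs on the first n cells and is undefined afterwards.
approx :: Integer -> [a] -> [a]
approx n _ | n <= 0 = undefined            -- approx 0 xs = ⊥
approx _ []         = []                   -- approx (n+1) [] = []
approx n (x : xs)   = x : approx (n - 1) xs
-- The clause approx (n+1) ⊥ = ⊥ needs no code: matching on ⊥ already diverges.
```

For instance, approx 3 [1, 2] is the whole list [1, 2], while approx 3 [1 ..] has exactly three defined elements, so take 3 (approx 3 [1 ..]) is [1, 2, 3] and any further demand diverges.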



We will prove: 

∀n. ∀f, g. approx n (mfix (λx. f x ++ g x)) = approx n (mfix f ++ mfix g) 

by induction on n, implying the required result. The base case (n = 0) is trivial. The induction 
hypothesis is: 

∀f, g. approx k (mfix (λx. f x ++ g x)) = approx k (mfix f ++ mfix g)        (B.1) 

Note that the hypothesis is assumed for all f and g. This generality will be essential in 
establishing the induction step. We need to show: 

∀f, g. approx (k+1) (mfix (λx. f x ++ g x)) = approx (k+1) (mfix f ++ mfix g) 

Pick two arbitrary functions f′, g′ :: α → [α]. It suffices to show that: 

approx (k+1) (mfix (λx. f′ x ++ g′ x)) = approx (k+1) (mfix f′ ++ mfix g′)        (B.2) 
which we establish by case analysis on f′ ⊥. The cases ⊥ and [ ] are immediate. By 4.5 
and 4.6, both sides reduce to ⊥ and approx (k+1) (mfix g′), respectively. If f′ ⊥ is a 
cons-cell, it follows that 

∀x. f′ x = (head · f′) x : (tail · f′) x        (B.3) 

Simple inspection of the definition of mfix reveals that mfix f′ must be a cons-cell in this 
case as well. Hence, we have: 

mfix f′ = head (mfix f′) : tail (mfix f′)        (B.4) 

Therefore, 

approx (k+1) (mfix (λx. f′ x ++ g′ x)) 
= approx (k+1) (mfix (λx. ((head · f′) x : (tail · f′) x) ++ g′ x))        {Eqn. B.3} 
= approx (k+1) (mfix (λx. (head · f′) x : ((tail · f′) x ++ g′ x))) 
= approx (k+1) (fix (head · f′) : mfix (λx. (tail · f′) x ++ g′ x))        {Eqn. 4.9} 
= fix (head · f′) : (approx k (mfix (λx. (tail · f′) x ++ g′ x))) 
= fix (head · f′) : (approx k (mfix (tail · f′) ++ mfix g′))        {I.H.} 
= approx (k+1) ((fix (head · f′) : mfix (tail · f′)) ++ mfix g′) 
= approx (k+1) ((head (mfix f′) : tail (mfix f′)) ++ mfix g′)        {Eqns. 4.7, 4.8} 
= approx (k+1) (mfix f′ ++ mfix g′)        {Eqn. B.4} 

completing the proof. 

B.7 Proposition 4.9.1 

We need to show that the function mfixErrM satisfies the strictness, purity and left shrinking 
properties. All cases follow from the corresponding properties of mfixM and simple 
symbolic manipulation. We will only present the left shrinking case to illustrate the technique. 
To avoid confusion due to overloaded operators, we will write returnM and bindM for the 
morphisms of m, and returnErrM and bindErrM for those of Err m. 

mfixErrM (λx. a `bindErrM` λy. f x y) 
= {Equation 4.36, expand mfixErrM} 
mfixM (λx. a `bindErrM` λy. f (unErr x) y) 
= {expand bindErrM} 
mfixM (λx. a `bindM` λy. case y of 
                            Ok q  → f (unErr x) q 
                            Err s → returnM (Err s)) 
= {left shrinking on mfixM} 
a `bindM` λy. mfixM (λx. case y of 
                            Ok q  → f (unErr x) q 
                            Err s → returnM (Err s)) 
= {Proposition 2.6.2, case is a shortcut for if} 
a `bindM` λy. case y of 
                 Ok q  → mfixM (λx. f (unErr x) q) 
                 Err s → mfixM (λx. returnM (Err s)) 
= {fold down mfixErrM on the first branch, Proposition 2.6.1 on the second} 
a `bindM` λy. case y of 
                 Ok q  → mfixErrM (λx. f x q) 
                 Err s → returnM (Err s) 
= {fold down bindErrM} 
a `bindErrM` λy. mfixErrM (λx. f x y) 
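
The left shrinking derivation can be exercised on a concrete instance. The sketch below is written under several assumptions: the newtype wrapper ErrM, the String payload of Err, and the choice of Identity as the underlying monad m are all made for illustration, and mfixErrM is reconstructed from the first step of the derivation (apply the underlying mfix after unErr) rather than quoted from Equation 4.36.

```haskell
import Control.Monad.Fix (MonadFix, mfix)
import Data.Functor.Identity (Identity (..))

data Err a = Ok a | Err String

-- Assumed representation of the transformed monad Err m.
newtype ErrM m a = ErrM { runErrM :: m (Err a) }

unErr :: Err a -> a
unErr (Ok a)  = a
unErr (Err s) = error ("unErr: " ++ s)

returnErrM :: Monad m => a -> ErrM m a
returnErrM = ErrM . return . Ok

bindErrM :: Monad m => ErrM m a -> (a -> ErrM m b) -> ErrM m b
bindErrM (ErrM a) f = ErrM (a >>= \y -> case y of
  Ok q  -> runErrM (f q)
  Err s -> return (Err s))

-- Reconstructed fixed-point operator: recursion is delegated to m.
mfixErrM :: MonadFix m => (a -> ErrM m a) -> ErrM m a
mfixErrM f = ErrM (mfix (runErrM . f . unErr))

-- The two sides of left shrinking, for the sample body f x y = η (y : x).
lhs, rhs :: ErrM Identity Int -> ErrM Identity [Int]
lhs a = mfixErrM (\x -> a `bindErrM` \y -> returnErrM (y : x))
rhs a = a `bindErrM` \y -> mfixErrM (\x -> returnErrM (y : x))

-- Observe a finite prefix of the result, or the error message.
observe :: ErrM Identity [Int] -> Either String [Int]
observe m = case runIdentity (runErrM m) of
  Ok xs -> Right (take 3 xs)
  Err s -> Left s
```

With a = returnErrM 1 both sides are observed as Right [1,1,1], and with a failing a = ErrM (Identity (Err "boom")) both are Left "boom", matching the two branches of the case expression in the proof.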



B.8 Proposition 6.3.5 

We will need the following two lemmas: 

Lemma B.8.1 Let T be a monad and mfix be a value recursion operator satisfying the 
right shrinking law. Let f : X → T (B × X) and g : B × X → T B′. Then, 

mfix (λ(_, x). f x >>= λz. g z >>= λw. η (w, π₂ z)) 
= mfix (λ(_, x). f x) >>= λz. g z >>= λw. η (w, π₂ z) 

Proof Note that the first mfix is at instance B′ × X, while the second is at B × X. We 
reason as follows: 

mfix (λ(_, x). f x >>= λz. g z >>= λw. η (w, π₂ z)) 
= mfix (λ(_, x). f x >>= λz. g z >>= λw. η (w, z) >>= λ(p, q). η (p, π₂ q)) 
= {slide, λ(p, q). (p, π₂ q) is strict} 
mfix ((λ(_, x). f x >>= λz. g z >>= λw. η (w, z)) · (λ(p, q). (p, π₂ q))) 
    >>= λ(p, q). η (p, π₂ q) 
= mfix (λ(_, t). f (π₂ t) >>= λz. g z >>= λw. η (w, z)) >>= λ(p, q). η (p, π₂ q) 
= {right shrinking} 
mfix (λ(_, x). f x) >>= λz. g z >>= λw. η (w, π₂ z) 

□ 
The second lemma states a variant of Equation 3.7: 

Lemma B.8.2 Let f :: α → m (β, τ), g :: τ → m α, where m is a commutative monad. 
Then, 

mfix (λt. g (π₂ t) >>= f) >>= η · π₁ 
= mfix (λt. f (π₂ t) >>= λt′. g (π₂ t′) >>= λa. η (π₁ t′, a)) >>= η · π₁ 

provided mfix satisfies strong sliding and nesting. 

Proof (Sketch) Note that the first mfix is at instance β × τ, while the second one is 
at β × α. The proof first extends the recursion to α × (β × τ), applies commutativity 
(Proposition 3.3.2), and then gets rid of the τ argument. □ 

To establish Proposition 6.3.5, we need to verify that the definition of trace as given 
by 6.21 satisfies Equations 6.14–6.20. We consider each case in turn: 

• Left tightening (6.14): 

trace (λ(a, x). g a >>= λa′. f (a′, x)) 
= λa. mfix (λ(b, x). g a >>= λa′. f (a′, x)) >>= η · π₁ 
= {left shrinking on mfix} 
λa. g a >>= λa′. mfix (λ(b, x). f (a′, x)) >>= η · π₁ 
= λa. g a >>= trace f 

• Right tightening (6.15): 

trace (λ(a, x). f (a, x) >>= λ(b, x). g b >>= λb′. η (b′, x)) 
= λa. mfix (λ(b, x). f (a, x) >>= λ(b, x). g b >>= λb′. η (b′, x)) >>= η · π₁ 
= λa. mfix (λ(b, x). f (a, x) >>= λz. (g · π₁) z >>= λb′. η (b′, π₂ z)) >>= η · π₁ 
= {Lemma B.8.1} 
λa. mfix (λ(b, x). f (a, x)) >>= λz. (g · π₁) z 
= λa. mfix (λ(b, x). f (a, x)) >>= λz. η (π₁ z) >>= λw. g w 
= λa. mfix (λ(b, x). f (a, x)) >>= η · π₁ >>= g 
= λa. trace f a >>= g 

• Sliding (6.16): 

trace (λ(a, x). g x >>= λx′. f (a, x′)) 
= λa. mfix (λ(b, x). g x >>= λx′. f (a, x′)) >>= η · π₁ 
= λa. mfix (λt. g (π₂ t) >>= curry f a) >>= η · π₁ 
= {Lemma B.8.2} 
λa. mfix (λt. curry f a (π₂ t) >>= λt′. g (π₂ t′) 
                               >>= λx′. η (π₁ t′, x′)) >>= η · π₁ 
= λa. mfix (λ(b, x′). f (a, x′) >>= λ(b, x). g x 
                               >>= λx′. η (b, x′)) >>= η · π₁ 
= trace (λ(a, x′). f (a, x′) >>= λ(b, x). g x >>= λx′. η (b, x′)) 

• Vanishing (6.17): 

trace (λ(a, ()). f a >>= λb. η (b, ())) 
= λa. mfix (λ(b, ()). f a >>= λb. η (b, ())) >>= η · π₁ 
= {constant functions} 
λa. f a >>= λb. η (b, ()) >>= η · π₁ 
= λa. f a >>= λb. η b 
= f 

• Vanishing (6.18): Let 

asc (x, (y, z)) = ((x, y), z) 
iasc ((x, y), z) = (x, (y, z)) 

Then, 

trace (trace (λ((a, x), y). f (a, (x, y)) >>= λ(b, (x, y)). η ((b, x), y))) 
= trace (trace (λt. f (iasc t) >>= η · asc)) 
= trace (trace (map asc · f · iasc)) 
= trace (λ(a, x). mfix (λ((_, _), y). (map asc · f · iasc) ((a, x), y)) >>= η · π₁) 
= trace (λ(a, x). mfix (map asc · (λ((_, _), y). f (a, (x, y)))) >>= η · π₁) 
= {slide, asc is strict} 
trace (λ(a, x). mfix (λ(_, (_, y)). f (a, (x, y))) >>= η · asc >>= η · π₁) 
= λa. mfix (λ(_, x). mfix (λ(_, (_, y)). f (a, (x, y))) >>= η · π₁ · asc) 
      >>= η · π₁ 
= {slide, π₁ · asc is strict} 
λa. mfix ((λ(_, x). mfix (λ(_, (_, y)). f (a, (x, y)))) · π₁ · asc) 
      >>= η · (π₁ · π₁ · asc) 
= {π₁ · π₁ · asc = π₁} 
λa. mfix (λ(_, (x, _)). mfix (λ(_, (_, y)). f (a, (x, y)))) >>= η · π₁ 
= {unnest triple} 
λa. mfix (λ(_, (x, y)). f (a, (x, y))) >>= η · π₁ 
= trace f 
• Superposing (6.19): Let asc be defined as above. 

trace (λ((c, a), x). f (a, x) >>= λ(b, x). η ((c, b), x)) 
= λ(c, a). mfix (λ((_, _), x). f (a, x) >>= λ(b, x). η ((c, b), x)) >>= η · π₁ 
= {slide} 
λ(c, a). mfix (λ(_, (_, x)). f (a, x) >>= λ(b, x). η (c, (b, x))) 
      >>= η · π₁ · asc 
= {pure right shrinking} 
λ(c, a). mfix (λ(b, x). f (a, x)) >>= λ(b, x). η (c, b) 
= λ(c, a). mfix (λ(b, x). f (a, x)) >>= λ(b′, x). η b′ >>= λb. η (c, b) 
= λ(c, a). mfix (λ(b, x). f (a, x)) >>= η · π₁ >>= λb. η (c, b) 
= λ(c, a). trace f a >>= λb. η (c, b) 

• Yanking (6.20): 

trace (λ(a, a′). η (a′, a)) 
= λa. mfix (λ(b, a′). η (a′, a)) >>= η · π₁ 
= {purity} 
λa. η (fix (λ(b, a′). (a′, a))) >>= η · π₁ 
= λa. η (a, a) >>= η · π₁ 
= η 
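
The proofs above all unfold trace in the same way, as λa. mfix (λ(b, x). f (a, x)) >>= η · π₁. A hedged Haskell rendering of that unfolding is sketched below, together with sample arguments for the yanking and first vanishing laws. The names yank and vanish are hypothetical, and the lazy patterns are an adaptation to Haskell's lifted products.

```haskell
import Control.Monad.Fix (MonadFix, mfix)

-- trace f a = mfix (λ(b, x). f (a, x)) >>= η · π₁
-- The first component of the recursive pair is ignored, the second is fed back.
trace :: MonadFix m => ((a, x) -> m (b, x)) -> a -> m b
trace f a = mfix (\ ~(_, x) -> f (a, x)) >>= return . fst

-- Argument of the yanking law: swap the input and the feedback value.
yank :: Monad m => (a, a) -> m (a, a)
yank (a, a') = return (a', a)

-- Argument of the vanishing law (6.17): the feedback carries no information.
-- The unit is matched with a wildcard so that it is never forced.
vanish :: Monad m => (a -> m b) -> (a, ()) -> m (b, ())
vanish f (a, _) = f a >>= \b -> return (b, ())
```

In the Maybe monad, trace yank 5 evaluates to Just 5, which is return 5, and trace (vanish f) 3 agrees with f 3 for a sample f such as \n -> Just (n * 2).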